Preface



Advances in the fields of signal processing, nonlinear dynamics, statistics,
and optimization theory, combined with marked improvements in instrumentation
and the development of computer systems, have made it possible to apply the
power of mathematics to the task of understanding the human brain. This
veritable revolution has already resulted in the widespread availability of
high-resolution neuroimaging devices in clinical as well as research settings.

Breakthroughs in functional imaging are not far behind. Mathematical
techniques developed for the study of complex nonlinear systems and chaos are
already being used to explore the complex nonlinear dynamics of human brain
physiology. Global optimization is being applied to data mining expeditions in
an effort to find knowledge in the vast amount of information being generated
by neuroimaging and neurophysiological investigations. These breakthroughs in
the ability to obtain, store, and analyze large datasets offer, for the first
time, exciting opportunities to explore the mechanisms underlying normal brain
function as well as the effects of diseases such as epilepsy, sleep disorders,
movement disorders, and cognitive disorders that affect millions of people
every year. Application of these powerful tools to the study of the human
brain requires, by necessity, collaboration among scientists, engineers,
neurobiologists, and clinicians. Each discipline brings to the table unique
knowledge, unique approaches to problem solving, and a unique language.

This book contains refereed invited papers submitted at the conference on
Quantitative Neurosciences: Models, Algorithms, Diagnosis, and Therapeutic
Applications, held at the University of Florida on February 5-7, 2003. The
conference evolved from a growing awareness, among acquaintances representing
a diverse spectrum of scientists, of an increasing need for recurrent and
ongoing evaluation of the rapidly accumulating knowledge of quantitative
neuroscientific correlates. The success of the conference has been due to the
remarkable ability of these scientists of markedly different persuasions to
expedite and stimulate discussion, to facilitate exchange, and to break down
interdisciplinary barriers with unifying cross talk. This heterogeneous group
included engineers, mathematicians, physicists, neurobiologists, neurologists,
child neurologists, neurophysiologists, neuropsychologists, and students. The
papers of this book reflect the topics discussed at the conference. A
recurrent theme is evident in this book, involving not only cerebral
organization but also its underlying
neuroanatomical/neuroquantitative/neurophysiological substrata, methods of
investigation, algorithms, and potential therapeutic applications. The book is
addressed to faculty, graduate students, and researchers in neurosciences,
biomedical engineering, and statistics.

The editors want to take this opportunity to express gratitude to Ortho-McNeil 
Pharmaceuticals, Inc., University of Florida College of Medicine, University of 
Florida McKnight Brain Institute, University of Florida Child Research Insti- 
tute, University of Florida College of Engineering, National Institutes of Health, 
and the University of Florida Division of Sponsored Research. 

Many thanks go especially to Dr. Walter Freeman, Professor, University of 
California, Berkeley, Dr. Kirk Frey, Professor, University of Michigan, Ann 
Arbor, and Dr. John Milton, Professor, Department of Neurology, University of 
Chicago for moderating the final discussion. Special thanks and appreciation 
go to Ms. Danielle Becker who helped with the conference and Mr. Bruno H. 
Chiarini for assisting us in the preparation of the camera-ready LaTeX form
of this book. Finally, we would like to thank Kluwer Academic Publishers for 
their assistance. 
“Men ought to know that from the brain and from the brain only arise
our pleasures, joys, laughter and sorrows...

Through it, we think, see, hear and distinguish the ugly from the beautiful,
the bad from the good, the pleasant from the unpleasant...

To consciousness the brain is messenger.”

Hippocrates (c.460-377 BC) 

On the Sacred Disease. 
Abstract Seizure occurrences seem to be random and unpredictable. However,
recent studies in epileptic patients suggest that seizures are deterministic
rather than random. There is growing evidence that seizures develop minutes to
hours before clinical onset. Our previous studies have shown that quantitative
analysis, based on chaos theory, of long-term intracranial
electroencephalogram (EEG) recordings may enable us to observe the development
of a seizure before its clinical onset. The period of seizure development is
called the preictal transition period, which is characterized by gradual
dynamical changes in the EEG signals of critical electrode sites from the
asymptomatic interictal state to seizure. Techniques used to detect a preictal
transition include statistical analysis of EEG signals, optimization
techniques, and nonlinear dynamics. In this paper, we present optimization
techniques, specifically multi-quadratic 0-1 programming, for the selection of
the cortical sites that are involved in seizure development during the
preictal transition period. The results of this study can be used as a
criterion to pre-select the critical electrode sites that can be used to
predict epileptic seizures.

Keywords: Multi-Quadratic 0-1 programming, Lyapunov exponents, EEG, Seizure prediction



1. Introduction 

In the last decade, time series analysis based on chaos theory and the theory
of nonlinear dynamics, which are among the most interesting and fastest-growing
research topics, has been applied to time series data with some degree of
success. The concepts of chaos theory and the theory of nonlinear dynamics
have not only been useful for analyzing specific systems of ordinary
differential equations or iterated maps, but have also offered new techniques
for time series analysis. Moreover, a variety of experiments have shown that a
recorded time series is driven by a deterministic dynamical system with a
low-dimensional chaotic attractor. An attractor is defined as the phase space
point or set of points representing the various possible steady-state
conditions of a system, that is, an equilibrium state or group of states to
which a dynamical system converges. Thus, the theories of chaos and nonlinear
dynamics have provided new theoretical and conceptual tools that allow us to
capture, understand, and link together the complex behaviors of simple
systems. Characterization and quantification of the dynamics of nonlinear time
series are also important steps toward understanding the nature of random
behavior and may enable us to predict the occurrences of some specific events
that follow temporal dynamical patterns in the time series.

In this paper, we are concerned with the problem of predicting target episodes
of events and discovering the temporal patterns in multiple time series that
govern those episodes. Traditional linear and nonlinear time series analysis
has been routinely used but has not given much insight into the
characteristics and mechanisms of time series, because these methods are
limited by the stationarity requirement on the time series and the normality
and independence requirements on the residuals. These limitations are
addressed by the development of new time series data mining concepts, which
generalize data mining concepts, dynamical approaches in chaos theory, and
optimization techniques to the area of time series analysis. These concepts
are used to develop new techniques for the prediction of time series arising
in real-world problems (e.g., electroencephalogram (EEG) time series) as well
as to conduct advanced studies on the subject. The developed techniques use a
combination of data mining techniques, dynamical approaches, and optimization
techniques applied to time series data, with the objective of discovering
temporal patterns in time series and then predicting events of interest.
Specifically, this paper integrates methods based on chaos theory, statistical
analysis, and optimization techniques to identify complex (nonperiodic,
nonlinear, irregular, and chaotic) characteristics and predict the onset of a
target event from complex real-world time series. In addition, we focus on the
statistical and optimization problems that enable us to detect statistically
significant temporal patterns that can be used to characterize and predict the
onset of target events in the time series. Identifying temporal patterns in
multiple time series is combinatorial in nature, since it involves the
selection of critical components in the system of interest. Therefore,
optimization techniques are developed to improve the performance of prediction
in the time series by identifying critical temporal patterns related to the
target events (e.g., setting dynamical parameters and selecting critical
components). For instance, several alternative optimization methods for
selecting the critical components in the systems are employed, and a novel
combination of methods for determining the optimal parameters can be applied
to systems with one or more hidden variables, which can be used to reconstruct
maps or differential equations of the dynamics of the system. Motivated by the
spin glass model, the problem of characterizing and identifying temporal
patterns is ideally suited to 0-1 (two-state) problems.

Herein, we direct our applications to bioengineering problems, particularly 
epilepsy and brain disorders. Epilepsy is among the most common disorders 
of the nervous system and consists of more than 40 clinical syndromes af- 
fecting 50 million people worldwide (approximately 1% of the population). 
Figure 1.1. (A) Inferior transverse and (B) lateral views of the brain,
illustrating approximate depth and subdural electrode placement for EEG
recordings. Subdural electrode strips are placed over the left orbitofrontal
(LOF), right orbitofrontal (ROF), left subtemporal (LST), and right
subtemporal (RST) cortex. Depth electrodes are placed in the left temporal
depth (LTD) and right temporal depth (RTD) to record hippocampal activity.



Epilepsy is characterized by intermittent seizures, that is, intermittent
paroxysmal rhythmic electrical discharges within the cerebrum that disrupt
normal brain function. Approximately 25 to 30% of patients receiving
medication have inadequate seizure control; in other words, their seizures are
resistant (refractory) to medical therapy. In some types of epilepsy (i.e.,
focal or partial epilepsy), there is a localized structural change in neuronal
circuitry within the cerebrum that produces organized quasi-rhythmic
discharges. These discharges then spread from the region of origin
(epileptogenic zone) to activate other areas of the cerebral hemisphere.
Although the macroscopic and microscopic features of the epileptogenic zone
have been characterized, the mechanism by which these fixed disturbances in
local circuitry produce intermittent disturbances of brain function is not yet
understood. The development of the epileptic state can be considered as
changes in the network circuitry of neurons in the brain. When neuronal
networks are activated, they produce a change in voltage potential, which can
be captured by an EEG. These changes appear as fluctuating traces along the
time axis in a typical EEG recording. A typical electrode montage for such
recordings is shown in Figure 1.1. The EEG onset of a typical epileptic
seizure of focal origin recorded with this montage is illustrated in
Figure 1.2. Figures 1.3 and 1.4 show the preictal state and postictal state of
a typical epileptic seizure, respectively.
Figure 1.2. Twenty-second EEG recording of the onset of a typical epileptic
seizure obtained from 32 electrodes. Each horizontal trace represents the
voltage recorded from electrode sites listed in the left column (see
Figure 1.1 for anatomical location of electrodes).

Figure 1.3. Twenty-second EEG recording of the preictal state of a typical
epileptic seizure obtained from 32 electrodes. Each horizontal trace
represents the voltage recorded from electrode sites listed in the left column
(see Figure 1.1 for anatomical location of electrodes).

Figure 1.4. Twenty-second EEG recording of the postictal state of a typical
epileptic seizure obtained from 32 electrodes. Each horizontal trace
represents the voltage recorded from electrode sites listed in the left column
(see Figure 1.1 for anatomical location of electrodes).

We are specifically interested in the prediction of epileptic seizures, whose
occurrence seems to be random and unpredictable. In essence, we integrate
techniques developed from data mining concepts, dynamical approaches in chaos
theory, and optimization into a set of tools used to extract dynamical changes
in the EEG time series that precede a seizure. Specifically, in this
framework, studies based on chaos theory of the spatiotemporal dynamics in
EEGs from patients with temporal lobe epilepsy demonstrate a preictal
transition (temporal patterns of dynamical changes in multiple EEG
recordings), characterized by a progressive convergence (entrainment) of
dynamical measures (e.g., short-term maximum Lyapunov exponents, STLmax) at
specific anatomical areas in the neocortex and hippocampus before the seizure
onset.

The organization of the succeeding sections of this chapter is as follows.
The background, the method used to estimate STLmax, and the spatiotemporal
dynamical analysis are described in section 2. In section 3, the
multi-quadratic 0-1 programming for the selection of critical cortical sites
is addressed, as well as the method used to test the hypothesis and the
results. The conclusions and discussion are presented in the final section,
section 5.

2. Background 

In the last decade, several quantitative system approaches incorporating
statistical techniques and nonlinear methods based on chaos theory have been
successfully used to study epilepsy, because the aperiodic and unstable
behavior of the epileptic brain is well suited to nonlinear techniques that
allow precise tracking of its temporal evolution. Our previous studies have
shown that seizures are deterministic rather than random. Consequently,
studies of the spatiotemporal dynamics in long-term intracranial EEGs from
patients with temporal lobe epilepsy demonstrated the predictability of
epileptic seizures; that is, seizures develop minutes to hours before clinical
onset. The period of seizure development is called the preictal transition
period, which is characterized by gradual dynamical changes in the EEG signals
of critical electrode sites, lasting approximately 1/2 to 1 hour before the
ictal onset [9, 11, 13, 15, 23, 24]. During a preictal transition period,
gradual dynamical changes can be exposed by a progressive convergence
(entrainment) of dynamical measures (e.g., short-term maximum Lyapunov
exponents, STLmax) at specific anatomical areas and cortical sites in the
neocortex and hippocampus. Another measure we have used in the state space
created from the EEG at individual electrode sites in the brain, the average
angular frequency (Ω), has also produced promising results. The value of Ω
quantifies the average rate of the temporal change in the state of a system
and is measured in rad/sec. Although the existence of the preictal transition
period has recently been confirmed and further defined by other investigators
[4, 5, 16, 17, 21], the characterization of this spatiotemporal transition is
still far from complete. For instance, even in the same patient, a different
set of cortical sites may exhibit a preictal transition from one seizure to
the next. In addition, this convergence of the normal sites with the
epileptogenic focus (critical cortical sites) is reset after each seizure
[14]. Therefore, complete or partial postictal resetting of the preictal
transition of the epileptic brain affects the route to the subsequent seizure,
contributing to the apparently nonstationary nature of the entrainment
process. In those studies, however, the selection of critical sites is not
trivial but is extremely important, since most groups of brain sites are
irrelevant to the occurrence of seizures and only certain groups of sites show
dynamical convergence in the preictal transition.

Since the brain is a nonstationary system, algorithms used to estimate mea- 
sures of the brain dynamics should be capable of automatically identifying and 
appropriately weighing existing transients in the data. In a chaotic system, or- 
bits originating from similar initial conditions (nearby points in the state space) 
diverge exponentially (expansion process). The rate of divergence is an
important aspect of the system dynamics and is reflected in the value of
Lyapunov exponents and dynamical phase.
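The exponential divergence of nearby orbits can be illustrated on a textbook system rather than on EEG data. The following Python sketch is only an illustration under stated assumptions: it uses the logistic map x -> r x (1 - x) with r = 4, which is not part of the study above, and the function name `logistic_lyapunov` and its parameters are hypothetical. It averages the base-2 logarithm of the local expansion factor |f'(x)| along one orbit, which gives the rate of divergence in bits per iteration.

```python
import math

def logistic_lyapunov(r=4.0, x0=0.3, n_iter=100_000, n_discard=1_000):
    """Average log2 stretching rate of the logistic map x -> r*x*(1-x).

    Illustrative toy system only; it is not the EEG estimator of the text.
    """
    x = x0
    for _ in range(n_discard):            # discard the initial transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_iter):
        # |f'(x)| = |r*(1-2x)| is the local expansion factor of nearby orbits
        total += math.log2(abs(r * (1.0 - 2.0 * x)))
        x = r * x * (1.0 - x)
    return total / n_iter                 # bits per iteration

rate = logistic_lyapunov()
```

For r = 4 the theoretical value is ln 2 nats, that is, 1 bit per iteration, so a positive `rate` close to 1 is expected; a positive value is the signature of the expansion process described above.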

2.1. Estimation of Short Term Largest Lyapunov 
Exponents 

The method we developed for the estimation of Short Term Largest Lyapunov
Exponents (STLmax), an estimate of Lmax for nonstationary data, is explained
in detail elsewhere [8, 10, 26]. Herein we present only a short description of
our method. Construction of the embedding phase space from a data segment
x(t) of duration T is made with the method of delays. The vectors $X_i$ in the
phase space (see Figure 1.5) are constructed as:

$$X_i = \left( x(t_i),\; x(t_i + \tau),\; \ldots,\; x(t_i + (p-1)\tau) \right) \qquad (1)$$

where τ is the selected time lag between the components of each vector in the
phase space, p is the selected dimension of the embedding phase space, and
$t_i \in [1, T - (p-1)\tau]$. If we denote by L the estimate of the short term
largest Lyapunov exponent STLmax, then:



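$$L = \frac{1}{N_a \Delta t} \sum_{i=1}^{N_a} \log_2 \frac{\left| \delta X_{i,j}(\Delta t) \right|}{\left| \delta X_{i,j}(0) \right|} \qquad (2)$$

with

$$\delta X_{i,j}(0) = X(t_i) - X(t_j) \qquad (3)$$

$$\delta X_{i,j}(\Delta t) = X(t_i + \Delta t) - X(t_j + \Delta t) \qquad (4)$$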



where 



■ $X(t_i)$ is the point of the fiducial trajectory $\phi_t(X(t_0))$ with
$t = t_i$, $X(t_0) = (x(t_0), \ldots, x(t_0 + (p-1)\tau))$, and $X(t_j)$ is a
properly chosen vector adjacent to $X(t_i)$ in the phase space (see below).

■ $\delta X_{i,j}(0) = X(t_i) - X(t_j)$ is the displacement vector at $t_i$,
that is, a perturbation of the fiducial orbit at $t_i$, and
$\delta X_{i,j}(\Delta t) = X(t_i + \Delta t) - X(t_j + \Delta t)$ is the
evolution of this perturbation after time $\Delta t$.

■ $t_i = t_0 + (i-1)\Delta t$ and $t_j = t_0 + (j-1)\Delta t$, where
$i \in [1, N_a]$ and $j \in [1, N]$ with $j \neq i$.

■ $\Delta t$ is the evolution time for $\delta X_{i,j}$, that is, the time one
allows $\delta X_{i,j}$ to evolve in the phase space. If the evolution time
$\Delta t$ is given in sec, then L is in bits per second.

■ $t_0$ is the initial time point of the fiducial trajectory and coincides with
the time point of the first data in the data segment of analysis. In the
estimation of L, for a complete scan of the attractor, $t_0$ should move
within $[0, \Delta t]$.

■ $N_a$ is the number of local Lmax's that will be estimated within a duration
T data segment. Therefore, if Dt is the sampling period of the time domain
data, $T = (N-1)Dt = N_a \Delta t + (p-1)\tau$.
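The definitions above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `delay_embed` and `stl_from_displacements` are hypothetical, the lag is assumed to be given in samples, and the selection of the adjacent vector $X(t_j)$ is deliberately left out, so the second function only evaluates equation (2) for displacement magnitudes that have already been obtained.

```python
import numpy as np

def delay_embed(x, p, tau):
    """Method of delays, equation (1).

    Row i is X_i = (x[i], x[i+tau], ..., x[i+(p-1)*tau]); tau is in samples.
    """
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (p - 1) * tau
    if n_vectors <= 0:
        raise ValueError("segment too short for this p and tau")
    idx = np.arange(n_vectors)[:, None] + tau * np.arange(p)[None, :]
    return x[idx]

def stl_from_displacements(d0, d_evolved, dt):
    """Equation (2) for given displacement magnitudes.

    d0[i]        = |dX_ij(0)|   (initial perturbation at t_i)
    d_evolved[i] = |dX_ij(dt)|  (same perturbation after evolution time dt)
    dt is in seconds, so the result is in bits per second.
    """
    d0 = np.asarray(d0, dtype=float)
    d_evolved = np.asarray(d_evolved, dtype=float)
    return np.sum(np.log2(d_evolved / d0)) / (len(d0) * dt)

# Usage: 10 samples, p = 3, tau = 2 gives 10 - (3-1)*2 = 6 vectors.
X = delay_embed(np.arange(10), p=3, tau=2)
L = stl_from_displacements([1.0, 1.0], [2.0, 4.0], dt=0.5)
```

In the usage lines, the two perturbations grow by factors of 2 and 4 (1 and 2 bits) over 0.5 sec each, so L = (1 + 2) / (2 × 0.5) = 3 bits per second.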
Figure 1.5. Diagram illustrating the estimation of STLmax measures in the
state space. The fiducial trajectory and the first three local Lyapunov
exponents (L1, L2, L3) are shown.



We computed the STLmax profiles using the method proposed by Iasemidis
et al. [8], which is a modification of the method by Wolf et al. [26]. We call
the measure short term to distinguish it from those used to study autonomous
dynamical systems. Modification of Wolf's algorithm is necessary to better
estimate STLmax in small data segments that include transients, such as
interictal spikes. The modification is primarily in the searching procedure
for a replacement vector at each point of a fiducial trajectory. For example,
in our analysis of the EEG, we found that the crucial parameter of the Lmax
estimation procedure, in order to distinguish between the preictal, the ictal,
and the postictal stages, was not the evolution time Δt nor the angular
separation $V_{i,j}$ between the evolved displacement vector
$\delta X_{i-1,j}(\Delta t)$ and the candidate displacement vector
$\delta X_{i,j}(0)$ (as was claimed in Frank et al. [6]). The crucial
parameter is the adaptive estimation in time and phase space of the magnitude
bounds of the candidate displacement vector, to avoid catastrophic
replacements. Results from simulation data of known attractors have shown the
improvement in the estimates of L achieved by using the proposed modifications
[8]. In the preictal state, depicted in Figure 1.6, one can see a trend of
STLmax toward lower values over the whole preictal period, with one prominent
drop in the value of STLmax approximately 24 minutes prior to the seizure
(denoted by an asterisk in the figure). This preictal drop in STLmax can be
explained as an attempt of the system to move toward a new state of fewer
degrees of freedom long before the actual seizure [11].

Figure 1.6. Smoothed STLmax profiles over 2 hours derived from an EEG signal
recorded at RTD2 (patient 1). A seizure (SZ #10) started and ended between the
two vertical dashed lines. The estimation of the Lmax values was made by
dividing the signal into non-overlapping segments of 10.24 sec each, using
p = 7 and τ = 20 msec for the phase space reconstruction. The smoothing was
performed by a 10 point (1.6 minutes) moving average window over the generated
STLmax profiles.
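The segmentation and smoothing steps mentioned for the STLmax profiles can be sketched as follows. This is a hedged illustration only: the helper names `split_segments` and `moving_average` are hypothetical, and the alignment of the moving average window (here, only fully covered windows are returned) is an assumption, since the text does not specify it.

```python
import numpy as np

def split_segments(signal, seg_len):
    """Cut a signal into non-overlapping segments; a trailing remainder is dropped."""
    signal = np.asarray(signal, dtype=float)
    n_seg = len(signal) // seg_len
    return signal[: n_seg * seg_len].reshape(n_seg, seg_len)

def moving_average(values, width):
    """Plain moving average over `width` consecutive values (assumed alignment)."""
    kernel = np.ones(width) / width
    return np.convolve(np.asarray(values, dtype=float), kernel, mode="valid")

# Usage: at a 200 Hz sampling rate, a 10.24 sec segment holds 2048 samples.
eeg = np.zeros(200 * 60)                 # one minute of placeholder data
segments = split_segments(eeg, 2048)     # one STLmax value would be estimated per row
smoothed = moving_average([1, 2, 3, 4, 5], width=2)
```

One STLmax value per row of `segments` yields the raw profile, and `moving_average` with `width=10` would correspond to the 10 point window named in the caption of Figure 1.6.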



2.2. Estimation of Dynamical Phase (Angular Frequency) 

Motivated by the representation of a state as a vector in the state space, we
have defined the difference in phase between two evolved states $X(t_i)$ and
$X(t_i + \Delta t)$ as $\Delta\Phi_i$ [12]. Then, denoting by
$\langle \Delta\Phi \rangle$ the average of the local phase differences
$\Delta\Phi_i$ between the vectors in the state space, we have:

$$\langle \Delta\Phi \rangle = \frac{1}{N_a} \sum_{i=1}^{N_a} \Delta\Phi_i \qquad (5)$$

where $N_a$ is the total number of phase differences estimated from the
evolution of $X(t_i)$ to $X(t_i + \Delta t)$ in the state space, according to:

$$\Delta\Phi_i = \left| \arccos \frac{X(t_i) \cdot X(t_i + \Delta t)}{\left\| X(t_i) \right\| \cdot \left\| X(t_i + \Delta t) \right\|} \right| \qquad (6)$$
Figure 1.7. A typical Ω profile before, during and after an epileptic seizure,
estimated from the EEG recorded from a site in the epileptogenic hippocampus;
the seizure occurred between the vertical lines.

Then, the average angular frequency Ω is:

$$\Omega = \frac{1}{\Delta t} \cdot \langle \Delta\Phi \rangle \qquad (7)$$

If Δt is given in sec, then Ω is given in rad/sec. Thus, while STLmax
measures the local stability of the state of the system on average, Ω measures
how fast a local state of the system changes on average (e.g., dividing Ω by
2π, the rate of change of the state of the system is expressed in
sec^-1 = Hz).
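Equations (5) to (7) can be sketched in a few lines of Python. This is an illustration under assumptions rather than the authors' code: the function name `average_angular_frequency` is hypothetical, the state vectors are assumed to be stored as rows of an array at a fixed sampling step, and the evolution time is assumed to be a whole number of samples (`step`).

```python
import numpy as np

def average_angular_frequency(X, step, dt):
    """Average angular frequency (rad/sec) from state vectors stored as rows of X.

    step : evolution time expressed in samples (assumed integer)
    dt   : the same evolution time in seconds
    """
    X = np.asarray(X, dtype=float)
    pts = X[::step]                       # states at t_i = t_0 + (i-1)*dt
    a, b = pts[:-1], pts[1:]              # X(t_i) and X(t_i + dt)
    cos_angle = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    )
    # clip guards against round-off pushing the cosine slightly outside [-1, 1]
    dphi = np.abs(np.arccos(np.clip(cos_angle, -1.0, 1.0)))   # equation (6)
    return dphi.mean() / dt                                    # equations (5), (7)

# Usage: a state rotating by 0.1 rad per sample, evolved over 0.5 sec per sample.
theta = 0.1 * np.arange(21)
X = np.column_stack([np.cos(theta), np.sin(theta)])
omega = average_angular_frequency(X, step=1, dt=0.5)
```

In the usage lines every phase difference equals 0.1 rad, so Ω = 0.1 / 0.5 = 0.2 rad/sec. A zero-length state vector would make the cosine undefined; the sketch does not handle that case.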

An example of a typical Ω profile over time is given in Figure 1.7. The
values are estimated from a 60-minute-long EEG sample recorded from an
electrode located in the epileptogenic hippocampus. The EEG sample includes
a 2-minute seizure that occurs in the middle of the recording. The state space
was reconstructed from sequential, non-overlapping EEG data segments of
2048 points (sampling frequency 200 Hz, hence each segment is 10.24 sec in
duration) with p = 7 and τ = 4, as for the estimation of STLmax profiles [12].
The preictal, ictal and postictal states correspond to medium, high and lower
values of Ω, respectively. The highest Ω values were observed during the ictal
period, and higher Ω values were observed during the preictal period than
during the postictal period. This pattern roughly corresponds to the typical
observation of higher frequencies in the original EEG signal ictally, and
lower EEG frequencies postictally. However, these observations can hardly
denote a long-term warning of an impending seizure.
2.3. Spatiotemporal Dynamical Analysis

Although a great deal is now known about low-dimensional chaos, the erratic
motion of dynamical systems described by a few variables, little is understood
about systems where the number of chaotic degrees of freedom becomes very
large. Typically such systems show disorder in both space and time and are
said to exhibit spatiotemporal chaos. Spatiotemporal chaos occurs when a
system of coupled dynamical systems gives rise to dynamical behavior that
exhibits both spatial disorder (as in rapid decay of spatial correlations) and
temporal disorder (as in nonzero Lyapunov exponents). This is an extremely
active and rather unsettled area of research. The system under consideration
(the brain) has a spatial extent and, as such, information about the
transition of the system towards the ictal state should also be included in
the interactions of its spatial components. The preictal transition, a
progressive convergence of STLmax profiles, is further evidence of
spatiotemporal chaos in the brain (shown in Figure 1.8). Having estimated the
STLmax temporal profiles at individual cortical sites, we quantify the
temporal evolution of the stability of each cortical site as the brain
proceeds towards the ictal state. The spatial dynamics of this transition are
captured by considering the relationship of the STLmax values between
different cortical sites. For example, if a similar transition occurs at
different cortical sites, the STLmax values of the involved sites are expected
to converge to similar values prior to the transition. We have called such
participating sites “critical sites”, and such a convergence “dynamical
entrainment.” More specifically, in order for the dynamical entrainment to
have a statistical content, we have allowed a period over which the mean of
the differences of the STLmax values at two sites is estimated. We have used
60 STLmax values (i.e., moving windows of approximately 10 minutes at each
electrode site) to test the dynamical entrainment at the 0.01 statistical
significance level. We employ the T-index as a measure of dynamical
entrainment of STLmax profiles over time. The T-index at time t between
electrode sites i and j is defined as:



$$T_{i,j}(t) = \sqrt{N} \times \frac{\left| E\{ STL_{max,i} - STL_{max,j} \} \right|}{\sigma_{i,j}(t)} \qquad (8)$$

where $E\{\cdot\}$ is the sample average of the differences
$STL_{max,i} - STL_{max,j}$ estimated over a moving window $w_t(\lambda)$, and
N is the length of the moving window. Then, $\sigma_{i,j}(t)$ is the sample
standard deviation of the STLmax differences between electrode sites i and j
within the moving window $w_t(\lambda)$. The T-index thus defined follows a
t-distribution with N - 1 degrees of freedom. For the estimation of the
$T_{i,j}(t)$ indices in our data we used N = 60 (i.e., an average of 60
differences of STLmax exponents between sites i and j per moving window of
approximately 10 minutes' duration). Therefore, a two-sided t-test with
N - 1 (= 59) degrees of freedom at a statistical significance level α should
be used to test the null hypothesis $H_0$: “brain sites i and j acquire
identical STLmax values at time t.” In this experiment, we set α = 0.01; that
is, the probability of a type I error, or better, the probability of falsely
rejecting $H_0$ if $H_0$ is true, is 1%. For the T-index to pass this test,
the $T_{i,j}(t)$ value should be within the interval [0, 2.662]. In
Figure 1.8(A), preictal entrainment (long before the occurrence of a seizure)
and postictal disentrainment (after the occurrence of the seizure) of the
STLmax profiles at two brain sites are shown. In Figure 1.8(B), this behavior
is quantified by the T-index profile between these sites. From this figure, it
is clear that attempts at spatiotemporal entrainment between brain sites occur
long before an epileptic seizure (first attempt about 70 minutes prior to
seizure). Postictally, this entrainment is reset.

Figure 1.8. Dynamical entrainment of a pair of brain sites between seizure #9
and #10 (patient 1). (A) STLmax values at the normal site ROF4 approach those
at the epileptogenic site RTD2 about 35 minutes into the recording. Then, for
70 minutes up to the seizure, the two sites interact with each other. (B) The
T-index profile generated from the STLmax profiles in (A) is shown. The
α = 0.05 statistical significance level entrainment zone is also shown (dashed
horizontal line). Entrainment of the two sites occurs when T-index values are
within the depicted entrainment zone.
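The T-index computation described in this section can be sketched in Python as follows. This is a minimal sketch under assumptions, not the authors' implementation: the names `t_index`, `t_index_profile`, and `T_CRITICAL` are hypothetical, the moving window is assumed to consist of the N most recent STLmax values up to time t, and the threshold 2.662 is simply the value quoted in the text for α = 0.01 and 59 degrees of freedom.

```python
import numpy as np

T_CRITICAL = 2.662   # threshold quoted in the text (alpha = 0.01, 59 d.o.f.)

def t_index(stl_i, stl_j):
    """T-index of two STLmax windows of equal length N (paired-t form)."""
    d = np.asarray(stl_i, dtype=float) - np.asarray(stl_j, dtype=float)
    n = len(d)
    # sample mean and sample standard deviation (ddof=1) of the differences
    return np.sqrt(n) * abs(d.mean()) / d.std(ddof=1)

def t_index_profile(stl_i, stl_j, window=60):
    """T-index over a trailing moving window (assumed window alignment)."""
    stl_i = np.asarray(stl_i, dtype=float)
    stl_j = np.asarray(stl_j, dtype=float)
    return np.array([
        t_index(stl_i[end - window:end], stl_j[end - window:end])
        for end in range(window, len(stl_i) + 1)
    ])

# Usage with a short toy window of N = 4 values per site.
profile = t_index_profile([2, 4, 6, 8, 10], [1, 2, 3, 4, 5], window=4)
entrained = profile <= T_CRITICAL   # True where the two sites count as entrained
```

For the first toy window the differences are 1, 2, 3, 4, with mean 2.5 and sample standard deviation sqrt(5/3), so the T-index is 2 × 2.5 / sqrt(5/3) = sqrt(15), about 3.87, which lies outside the entrainment zone. Identical difference values within a window give a zero standard deviation, which this sketch does not guard against.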
Figure 1.9. Angular frequency Ω profiles from two left orbitofrontal electrode
sites over 3.5 hours between seizures 13 and 14 (patient 1). The ictal periods
of the two seizures are denoted by vertical lines.

Figure 1.10. The T-index profile between the two electrode sites whose Ω
profiles are depicted in Figure 1.9. The two sites are dynamically entrained
1.75 to 1.5 hours, as well as 1.2 hours, prior to the onset of seizure 14. The
T1 and T2 statistical thresholds are represented by the two horizontal lines.
In Figure 1.9, the Ω profiles at two cortical sites are shown for the interval
between seizures 13 and 14 in patient 1. For these cortical sites, a
remarkable feature is observed: a long-term convergence of their Ω profiles
prior to seizure 14. We have called this convergence “dynamical entrainment”
and we have quantified it by a T-statistic that provides a comparison between
the two electrode sites (shown in Figure 1.10).

3. Method and Results 

In this paper, we developed optimization techniques for electrode site se- 
lection to test the hypothesis that the set of cortical sites that participate in 
the preictal transition before the current seizure is most likely to participate in 
the preictal transition again before the next seizure. The set of participating 
sites is defined as a set of cortical sites with minimum difference in STLmax 
(most converged) prior to the current seizure and then reset after the seizure. 
To select the participating sites, we formulate this problem as a multi-quadratic 
0-1 problem. We tested this hypothesis on the continuous long-term (3 to 12 
days) multichannel intracranial EEG recordings that had been acquired from 
3 patients with medically intractable temporal lobe epilepsy. The recordings 
were obtained as part of a pre-surgical clinical evaluation. They had been
obtained using Nicolet BMSI 4000 and 5000 recording systems, with a 0.1 Hz
high-pass and a 70 Hz low-pass filter. Each record included a total of 28 to 32
intracranial electrodes (8 subdural and 6 hippocampal depth electrodes for each 
cerebral hemisphere). 

In this framework, we first estimate STLmax, which is a measure of the order
or disorder of the EEG signals recorded from individual electrode sites. Based
on STLmax, a multi-quadratic 0-1 programming problem is solved to identify the
participating electrode sites. We then compare the probability of
participating in the preictal transition of the next seizure for the electrode
sites selected by the optimization problem with that for randomly selected
electrode sites. The results of this study can be used as a criterion to
pre-select the critical electrode sites that can be used to detect the
preictal transition before an impending seizure.

4. Quadratic 0-1 Programming 

In this paper we refer to the Sherrington-Kirkpatrick Hamiltonian that
describes the mean-field theory of spin glasses, where elements are placed on
the vertices of a regular lattice, the magnetic interactions hold only for
nearest neighbors and every element has only two states (Ising spin glasses
[1-3,7,18]). One of the most interesting problems about this model is the
determination of the minimal energy states (GROUND STATE problem).

For many years the Ising model has been a powerful tool in studying phase
transitions in statistical physics. Such an Ising model can be described by a
graph $G(V, E)$ having $n$ vertices $\{v_1, \ldots, v_n\}$, each edge
$(i, j) \in E$ having a weight (interaction energy) $J_{ij}$. Each vertex $v_i$
has a magnetic spin variable $\sigma_i \in \{-1, +1\}$ associated with it. An
optimal spin configuration of minimum energy is obtained by minimizing the
Hamiltonian

$$H(\sigma) = -\sum_{(i,j) \in E} J_{ij}\,\sigma_i \sigma_j \quad \text{over all } \sigma \in \{-1, +1\}^n.$$

This problem is equivalent to the combinatorial problem of quadratic bivalent 
programming [7]. 
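The equivalence between the spin formulation and a 0-1 formulation rests on the substitution $\sigma_i = 2x_i - 1$ with $x_i \in \{0, 1\}$. The following is a minimal sketch of that idea, not the authors' code: the function names are hypothetical, the interaction matrix `J` is assumed symmetric with only its upper triangle used, and exhaustive enumeration is used only because the toy instance is tiny.

```python
from itertools import product


def ising_energy(J, sigma):
    """H(sigma) = -sum over pairs i<j of J[i][j] * sigma_i * sigma_j."""
    n = len(sigma)
    return -sum(J[i][j] * sigma[i] * sigma[j]
                for i in range(n) for j in range(i + 1, n))


def ground_state(J):
    """Exhaustively search all spin vectors in {-1,+1}^n (toy sizes only)."""
    n = len(J)
    return min(product((-1, 1), repeat=n), key=lambda s: ising_energy(J, s))


def energy_from_binary(J, x):
    """Evaluate the same energy from a 0-1 vector via sigma_i = 2*x_i - 1."""
    sigma = tuple(2 * xi - 1 for xi in x)
    return ising_energy(J, sigma)


# Hypothetical 3-spin ferromagnet: every pair has interaction energy 1.
J = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
best = ground_state(J)
print(best, ising_energy(J, best))
```

Because every spin vector corresponds to exactly one 0-1 vector, minimizing over one set is the same as minimizing over the other.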

Quadratic zero-one programming has been extensively used to study Ising
spin glass models. This has motivated us to use quadratic 0-1 programming to
select the critical cortical sites, where each electrode has only two states, and
to determine the minimal-average T-index state. We formulated this problem
as a quadratic 0-1 knapsack problem, with the objective function minimizing the
average T-index (a measure of statistical distance between the mean values of
STLmax) among electrode sites and the knapsack constraint fixing the
number of critical cortical sites.

Let $A$ be an $n \times n$ matrix whose element $a_{ij}$ represents the T-index
between electrodes $i$ and $j$ within the 10-minute window before the onset of
a seizure. Define $x = (x_1, \ldots, x_n)$, where each $x_i$ represents the
cortical electrode site $i$. If the cortical site $i$ is selected to be one of
the critical electrode sites, then $x_i = 1$; otherwise, $x_i = 0$.
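Each entry of $A$ is a T-index, which the chapter describes as a paired-t-type statistic: $\sqrt{N}$ times the absolute mean difference of two STLmax series over a window, divided by the standard deviation of those differences. The sketch below illustrates that computation under stated assumptions: the sample (n-1) standard deviation is assumed, the function names are hypothetical, and the input values are made up for illustration rather than taken from any recording.

```python
import math
import statistics


def t_index(stl_i, stl_j):
    """sqrt(N) * |mean(d)| / std(d), with d the pairwise STLmax differences.

    Assumes equal-length windows and a non-zero standard deviation.
    """
    diffs = [a - b for a, b in zip(stl_i, stl_j)]
    n = len(diffs)
    return math.sqrt(n) * abs(statistics.fmean(diffs)) / statistics.stdev(diffs)


def t_index_matrix(profiles):
    """Symmetric matrix of pairwise T-indices with a zero diagonal."""
    n = len(profiles)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            A[i][j] = A[j][i] = t_index(profiles[i], profiles[j])
    return A


# Made-up STLmax values for two sites over a 4-point window.
site_a = [5.0, 6.0, 7.0, 8.0]
site_b = [4.0, 4.0, 6.0, 6.0]
print(t_index(site_a, site_b))
```

A small T-index means the two profiles are statistically close over the window, which is why the selection problem minimizes the sum of these entries.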

A quadratic function is defined on $\mathbb{R}^n$ by

$$\min f(x) = x^T A x, \quad \text{s.t. } x_i \in \{0, 1\},\ i = 1, \ldots, n \tag{9}$$

where $A$ is an $n \times n$ matrix [19,20]. Throughout this section the following
notations will be used.

■ $\{0, 1\}^n$: set of $n$-dimensional 0-1 vectors.

■ $\mathbb{R}^{n \times n}$: set of $n \times n$ dimensional real matrices.

■ $\mathbb{R}^n$: set of $n$-dimensional real vectors.

Next, we add a linear constraint, $\sum_{i=1}^{n} x_i = k$, where $k$ is the
number of critical electrode sites that we want to select. We now consider the
following linearly constrained quadratic 0-1 problem:

$$P:\ \min f(x) = x^T A x, \quad \text{s.t. } \sum_{i=1}^{n} x_i = k \text{ for some } k,\ x \in \{0, 1\}^n,\ A \in \mathbb{R}^{n \times n} \tag{10}$$

Problem $P$ can be formulated as a quadratic 0-1 problem of the form as in (9)
by using an exact penalty. If $A = (a_{ij})$ then let
$M = 2\left[\sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|\right] + 1$.

$$\bar{P}:\ \min g(x) = x^T A x + M\left(\sum_{i=1}^{n} x_i - k\right)^2, \quad \text{s.t. } x \in \{0, 1\}^n,\ A \in \mathbb{R}^{n \times n} \tag{11}$$
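The exact-penalty idea can be checked numerically on a toy instance: with a large enough penalty weight, the unconstrained minimizer over $\{0,1\}^n$ coincides with the minimizer that selects exactly $k$ sites. The sketch below is illustrative only. The penalty weight is taken as twice the sum of absolute entries plus one, which is an assumption (any sufficiently large value works), the matrix is made up, and brute force replaces the branch and bound method used in the chapter.

```python
from itertools import product


def quad(A, x):
    """Evaluate x^T A x for a 0-1 vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def solve_constrained(A, k):
    """Minimize x^T A x over 0-1 vectors with exactly k ones."""
    n = len(A)
    feasible = (x for x in product((0, 1), repeat=n) if sum(x) == k)
    return min(feasible, key=lambda x: quad(A, x))


def solve_penalized(A, k):
    """Minimize x^T A x + M * (sum(x) - k)^2 over all 0-1 vectors."""
    n = len(A)
    M = 2 * sum(abs(a) for row in A for a in row) + 1  # assumed penalty weight
    return min(product((0, 1), repeat=n),
               key=lambda x: quad(A, x) + M * (sum(x) - k) ** 2)


# Made-up symmetric matrix with zero diagonal, standing in for a T-index matrix.
A = [[0, 3, 1, 4],
     [3, 0, 2, 5],
     [1, 2, 0, 6],
     [4, 5, 6, 0]]
print(solve_constrained(A, 2), solve_penalized(A, 2))
```

Any vector that violates the cardinality constraint pays at least $M$, which exceeds the largest objective value a feasible vector can take, so it can never be optimal for the penalized problem.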

To solve this problem, we considered 3 computational approaches. In the first
approach, we solved (11) by applying a branch and bound algorithm with a
dynamic rule for fixing variables [19,20]. In the second approach, we used a
linearization technique to formulate the quadratic integer programming (QIP)
problem in (10) as an integer programming (IP) problem by introducing a new
variable for each product of two variables and adding some additional
constraints, and then formulated this problem as a linear 0-1 problem. In the
third approach, we employed the Karush-Kuhn-Tucker optimality conditions of the
linearly constrained quadratic 0-1 problem in (10) to formulate this problem
as a mixed-integer linear programming (MILP) problem. Details of the first
approach can be found in [19,20]; next, we discuss the second and the third
approaches.

4.1. Conventional Linearization Approach 

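For each product $x_i x_j$, we introduce a new 0-1 variable, $x_{ij} = x_i x_j$
$(i \neq j)$. Note that $x_{ii} = x_i^2 = x_i$ for $x_i \in \{0, 1\}$. After
linearization, the equivalent IP formulation is given by:

$$\min \sum_{i} \sum_{j} a_{ij} x_{ij} \tag{12}$$

$$\text{s.t. } \sum_{i=1}^{n} x_i = k, \tag{13}$$

$$x_{ij} \le x_i, \quad \text{for } i, j = 1, \ldots, n\ (i \neq j) \tag{14}$$

$$x_{ij} \le x_j, \quad \text{for } i, j = 1, \ldots, n\ (i \neq j) \tag{15}$$

$$x_i + x_j - 1 \le x_{ij}, \quad \text{for } i, j = 1, \ldots, n\ (i \neq j) \tag{16}$$

where $x_i \in \{0, 1\}$ and $x_{ij} \in \{0, 1\}$, $i, j = 1, \ldots, n$.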

The number of 0-1 variables has been increased to $O(n^2)$. Although we
can apply CPLEX 7.0 to solve problems with $n = 30$, this approach becomes
computationally inefficient as $n$ increases. Future technology will require the
ability to efficiently solve problems with much larger values of $n$. For
instance, micro-electrodes will be implanted in the future ($n > 1000$).
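Two properties of the conventional linearization can be verified directly: for binary inputs the three inequalities leave exactly one feasible value for each product variable, namely $x_i x_j$, and the number of 0-1 variables grows quadratically. The sketch below checks both; the function names are hypothetical and the variable count assumes one product variable per ordered pair $i \neq j$.

```python
def feasible_products(xi, xj):
    """Binary values of x_ij allowed by x_ij <= x_i, x_ij <= x_j, x_i + x_j - 1 <= x_ij."""
    return [xij for xij in (0, 1)
            if xij <= xi and xij <= xj and xi + xj - 1 <= xij]


def variable_counts(n):
    """0-1 variable counts before and after linearization (ordered pairs i != j)."""
    return {"original": n, "linearized": n + n * (n - 1)}


for xi in (0, 1):
    for xj in (0, 1):
        print(xi, xj, feasible_products(xi, xj))
print(variable_counts(30))
```

For $n = 30$ this already gives 900 binary variables, and for $n = 1000$ it would give one million, which is the growth the text warns about.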
4.2. KKT Conditions Linearization Approach

Consider a linearly constrained quadratic problem given by:

$$\min z(x) = x^T A x, \quad \text{s.t. } \sum_{i=1}^{n} x_i = k,\ x_i \ge 0,\ i = 1, \ldots, n \tag{17}$$

We then have the following Karush-Kuhn-Tucker conditions:

$$-2Ax + u \cdot e + y = 0 \tag{18}$$

$$\sum_{i=1}^{n} x_i = k \tag{19}$$

$$y^T x = 0, \tag{20}$$

where $u$ and $y$ are Lagrangian multipliers. Note that $u$ is a scalar and $y$
is a column vector. We add the slack variables $w$, which is a column vector, to
(18) and then denote a column vector $s = u \cdot e + w$. We then have the KKT
conditions given by:

$$-2Ax + y + s = 0$$

$$\sum_{i=1}^{n} x_i = k$$

$$y^T x = 0.$$

We can formulate the above KKT conditions as a MILP formulation. The
objective function is to minimize the summation of variables $s_i$. Because
$x_i$ are 0-1 variables, we can replace the last constraint with
$y_i \le M(1 - x_i)$, for $i = 1, \ldots, n$, where $M = \|A\|_\infty$. We then
have the MILP formulation given by:

$$
\begin{aligned}
\min\ & \sum_{i=1}^{n} s_i \\
\text{s.t. } & \sum_{j=1}^{n} a_{ij} x_j - s_i - y_i = 0, && \text{for } i = 1, \ldots, n \\
& \sum_{i=1}^{n} x_i - k = 0 \\
& y_i - M(1 - x_i) \le 0, && \text{for } i = 1, \ldots, n
\end{aligned}
\tag{21}
$$

where $x_i \in \{0, 1\}$ and $s_i, y_i \ge 0$, for $i = 1, \ldots, n$.
Next, we want to show the optimality proof of the KKT conditions linearization
approach. Consider a quadratic zero-one programming problem, which
has the form

$$\min f(x) = x^T A x, \quad \text{s.t. } Bx \ge b, \tag{22}$$

where $A$ is an $n \times n$ matrix whose elements satisfy $a_{ij} \ge 0$,
$i, j = 1, \ldots, n$, $x \in \{0, 1\}^n$, $B$ is an $m \times n$ matrix, $b$ is
a constant vector, and $m$ and $n$ are some integer numbers.

Let $e$ be a vector of all 1's, i.e. $e = (1, \ldots, 1)^T$. Consider the
following two problems:

$P_1$: $\min f(x) = x^T A x$, s.t. $Bx \ge b$, $x \in \{0, 1\}^n$.

$\bar{P}_1$: $\min g(s) = e^T s$, s.t. $Ax - y - s = 0$, $Bx \ge b$,
$y^T x = 0$, $x \in \{0, 1\}^n$, $y_i \ge 0$, $s_i \ge 0$.

Let us prove the following theorem. 

Theorem 1 $P_1$ has an optimal solution $x^0$ iff there exist $y^0$, $s^0$ such
that $(x^0, y^0, s^0)$ is an optimal solution of $\bar{P}_1$.

Proof 1 Necessity. Let $x^0$ be an optimal solution of the problem $P_1$. Since
all elements of the matrix $A$ are nonnegative, it is obvious that
$\exists\, y, s : y \ge 0,\ s \ge 0$ such that

$$Ax^0 - y - s = 0, \tag{23}$$

$$y^T x^0 = 0. \tag{24}$$

Choose $y^0$, $s^0$ from the above defined set of $y$ and $s$ such that
$e^T s^0$ is minimized. Then, we prove that $(x^0, y^0, s^0)$ is an optimal
solution of the problem $\bar{P}_1$.

Multiplying (23) by $(x^0)^T$, we obtain
$(x^0)^T A x^0 - (x^0)^T y^0 - (x^0)^T s^0 = 0$. Note from (24) that
$(x^0)^T y^0 = 0$. Hence, we have

$$(x^0)^T A x^0 = (x^0)^T s^0. \tag{25}$$

If we can prove that

$$(x^0)^T s^0 = e^T s^0, \tag{26}$$

then $(x^0, y^0, s^0)$ is an optimal solution of $\bar{P}_1$. To prove that (26)
holds, it is sufficient to show that, for any $i$, if $x_i^0 = 0$ then
$s_i^0 = 0$. We can prove this by contradiction.

Assume that for some $i$, $x_i^0 = 0$ and $s_i^0 > 0$, where $(y^0, s^0)$ were
chosen to minimize $e^T s^0$. Define vectors $\tilde{y}$ and $\tilde{s}$ as
$\tilde{y}_i = y_i^0 + s_i^0$, $\tilde{s}_i = 0$, and, for $j \neq i$,
$\tilde{y}_j = y_j^0$, $\tilde{s}_j = s_j^0$. It is easy to check that
$(x^0, \tilde{y}, \tilde{s})$ also satisfies (23), (24), and
$e^T \tilde{s} < e^T s^0$. This contradicts the initial assumption that $s^0$
and $y^0$ were chosen to minimize $e^T s^0$.

Sufficiency. The proof is similar.

It is easy to see from the complementarity constraint $y^T x = 0$ that for every
$i$ where $x_i = 1$, we need to have $y_i = 0$; for every $i$ where $x_i = 0$,
the value of $y_i$ does not depend on this constraint. Also note from (23) that
the value of $y_i$ is upper bounded by the value of
$M = \max_i \sum_j |a_{ij}| = \|A\|_\infty$. Therefore, we can reformulate
$\bar{P}_1$ as a linear mixed 0-1 programming problem by replacing the
complementarity constraint $y^T x = 0$ with the linear constraint
$y \le M(e - x)$. As a result we obtain the following formulation:

$\tilde{P}_1$: $\min g(s) = e^T s$, s.t. $Ax - y - s = 0$, $Bx \ge b$,
$y \le M(e - x)$, $s_i \ge 0$, $y_i \ge 0$, $x \in \{0, 1\}^n$.

From Theorem 1, we have shown that problems $P_1$, $\bar{P}_1$, and
$\tilde{P}_1$ are “equivalent.” Therefore, the QIP formulation in (10) is
equivalent to the MILP formulation in (21). From (8),
$T_{ij}(t) = \sqrt{N} \times |E\{STL_{\max,i} - STL_{\max,j}\}| / \sigma_{i,j}(t)$,
we note that every element in the T-index matrix $A$ is positive. For this
reason, in every instance, by solving the MILP problem in (21) we can find the
global solution to the original QIP problem in (10). Applying CPLEX 7.0, this
problem can be easily solved with $n = 30$. In addition, this formulation
remains computationally efficient as $n$ increases because the number of 0-1
variables is $O(n)$. From computational experiments, the above linear mixed
integer 0-1 problem is the most efficient approach in our application; see
Table 1.1 and Figure 1.11.
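The core of Theorem 1 can be illustrated numerically: for a fixed 0-1 vector $x$ and a nonnegative matrix $A$, the smallest slack sum $e^T s$ compatible with $Ax - y - s = 0$, $y \le M(e - x)$, $y, s \ge 0$ equals $x^T A x$. The sketch below constructs that minimal $(s, y)$ pair explicitly instead of calling a MILP solver; the function names and the test matrix are hypothetical.

```python
from itertools import product


def quad(A, x):
    """Evaluate x^T A x for a 0-1 vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def min_slack_sum(A, x):
    """Smallest e^T s with Ax - y - s = 0, 0 <= y <= M(e - x), s >= 0.

    Where x_i = 1 the bound forces y_i = 0, so s_i = (Ax)_i.
    Where x_i = 0 the whole row value can be absorbed by y_i, so s_i = 0.
    Assumes every entry of A is nonnegative.
    """
    n = len(x)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    M = max(sum(abs(a) for a in row) for row in A)  # infinity norm of A
    s = [Ax[i] if x[i] == 1 else 0 for i in range(n)]
    y = [0 if x[i] == 1 else Ax[i] for i in range(n)]
    assert all(0 <= y[i] <= M * (1 - x[i]) for i in range(n))
    return sum(s), s, y


def objectives_agree(A):
    """Check e^T s == x^T A x for every 0-1 vector x (toy sizes only)."""
    n = len(A)
    return all(min_slack_sum(A, x)[0] == quad(A, x)
               for x in product((0, 1), repeat=n))


A = [[0, 3, 1, 4],
     [3, 0, 2, 5],
     [1, 2, 0, 6],
     [4, 5, 6, 0]]
print(objectives_agree(A))
```

Since the two objectives agree for every $x$, minimizing the linear objective over the mixed 0-1 feasible set selects the same $x$ as minimizing the quadratic one, while using only $n$ binary variables.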



Table 1.1. Performance characteristics of two proposed approaches compared with complete
enumerations

Number of selected    KKT Conditions    Linearization    Complete
electrodes            Approach          Approach         Enumerations

5 (out of 30)         297               656              15
6 (out of 30)         406               735              78
7 (out of 30)         609               968              313
8 (out of 30)         1797              2610             1141
9 (out of 30)         2562              5235             3578



4.3. Multi-Quadratic 0-1 Programming 

Our group has shown dynamical resetting of the brain following seizures 
[14,22,25], that is, divergence of STLmax profiles after seizures. Therefore, 
we want to incorporate this finding with our existing critical electrode selection 
problem (QIP problem in (10)). Thus, we have to ensure that the optimal group 
of critical sites shows this divergence by adding one more quadratic constraint 
to the QIP problem in (10). The multi-quadratic integer programming (MQIP)
problem is given by:

$$
\begin{aligned}
\min\ & x^T A x \\
\text{s.t. } & \sum_{i=1}^{n} x_i = k \\
& x^T B x \ge T_\alpha\, k(k - 1)
\end{aligned}
\tag{27}
$$

where $x_i \in \{0, 1\}\ \forall i \in \{1, \ldots, n\}$.

Figure 1.11. Performance characteristics of two proposed approaches compared with complete
enumerations

Let $B$ be an $n \times n$ matrix whose element $b_{ij}$ represents the T-index
between electrodes $i$ and $j$ within the 10-minute window after the onset of a
seizure. Note that the matrix $A = (a_{ij})$ is the T-index matrix of brain
sites $i$ and $j$ within the 10-minute window before the onset of a seizure.
$T_\alpha$ is the critical value of the T-index, as previously defined, to
reject $H_0$: “two brain sites acquire identical STLmax values within time
window $w_t(\lambda)$.”
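The selection problem in (27) can be made concrete with a brute-force sketch: among all groups of $k$ sites whose post-seizure T-indices sum to at least $T_\alpha k(k-1)$, pick the group with the smallest pre-seizure T-index sum. This is only an illustration of the problem statement, not the method used in the chapter. The matrices and the threshold below are made up, zero diagonals are assumed so that $x^T A x$ reduces to a sum over ordered pairs, and enumeration is feasible only for small $n$.

```python
from itertools import combinations


def pair_sum(T, sites):
    """x^T T x for the 0-1 vector that selects `sites` (zero diagonal assumed)."""
    return sum(T[i][j] for i in sites for j in sites if i != j)


def select_critical_sites(A, B, k, t_alpha):
    """Return (cost, sites) minimizing the preictal sum subject to postictal divergence."""
    best = None
    for sites in combinations(range(len(A)), k):
        if pair_sum(B, sites) < t_alpha * k * (k - 1):
            continue  # this group does not reset (diverge) after the seizure
        cost = pair_sum(A, sites)
        if best is None or cost < best[0]:
            best = (cost, sites)
    return best


# Made-up preictal (A) and postictal (B) T-index matrices for 4 sites.
A = [[0, 3, 1, 4],
     [3, 0, 2, 5],
     [1, 2, 0, 6],
     [4, 5, 6, 0]]
B = [[0, 1, 1, 9],
     [1, 0, 9, 9],
     [1, 9, 0, 9],
     [9, 9, 9, 0]]
print(select_critical_sites(A, B, k=2, t_alpha=2.0))
print(select_critical_sites(A, B, k=2, t_alpha=0.0))
```

In this toy instance the most entrained pair before the seizure, sites 0 and 2, stays entrained afterwards and is therefore rejected once the divergence constraint is active.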

With one more quadratic constraint, the problem in (27) becomes much
harder to solve. Note that the first approach, a branch and bound algorithm
with a dynamic rule for fixing variables, cannot be applied to solve this problem
because of the additional quadratic constraint. However, we can modify the MIP
formulation in (21) from the previous section and reformulate this problem by
adding one more linearized constraint. The equivalent IP formulation is given
by:

$$
\begin{aligned}
\min\ & \sum_{i} \sum_{j} a_{ij} x_{ij} \\
\text{s.t. } & \sum_{i=1}^{n} x_i = k, \\
& x_{ij} \le x_i, && \text{for } i, j = 1, \ldots, n\ (i \neq j) \\
& x_{ij} \le x_j, && \text{for } i, j = 1, \ldots, n\ (i \neq j) \\
& x_i + x_j - 1 \le x_{ij}, && \text{for } i, j = 1, \ldots, n\ (i \neq j) \\
& \sum_{i} \sum_{j} b_{ij} x_{ij} \ge T_\alpha\, k(k - 1)
\end{aligned}
$$

where $x_i \in \{0, 1\}$ and $x_{ij} \in \{0, 1\}$, $i, j = 1, \ldots, n$. As we
mentioned in the previous section, the above formulation is not computationally
efficient as $n$ increases.

Next, we want to show the optimality proof of the KKT conditions linearization
approach, which can solve the MQIP problem in (27) optimally. Now consider
the case when we have a quadratic constraint. Let $C$ be an $n \times n$ matrix
whose elements satisfy $c_{ij} \ge 0$, $i, j = 1, \ldots, n$.

Consider the following two problems:

$P_2$: $\min f(x) = x^T A x$, s.t. $Bx \ge b$, $x^T C x \ge \alpha$,
$x \in \{0, 1\}^n$, where $\alpha$ is a positive constant.

$\bar{P}_2$: $\min g(s) = e^T s$, s.t. $Ax - y - s = 0$, $Bx \ge b$,
$y \le M(e - x)$, $Cx - z \ge 0$, $e^T z \ge \alpha$, $z \le M'x$,
$x \in \{0, 1\}^n$, $y_i, s_i, z_i \ge 0$, where $M' = \|C\|_\infty$ and
$M = \|A\|_\infty$.

Let us prove the following theorem. 

Theorem 2 $P_2$ has an optimal solution $x^0$ iff there exist $y^0$, $s^0$,
$z^0$ such that $(x^0, y^0, s^0, z^0)$ is an optimal solution of $\bar{P}_2$.

Proof 2 Necessity. From the proof of Theorem 1, it is obvious that we only
need to show that if $x^0$ is an optimal solution of the problem $P_2$ then
there exists a vector $z^0$ such that every component is nonnegative, i.e.
$z_i^0 \ge 0$, and the following constraints are satisfied:

$$Cx^0 - z^0 \ge 0, \tag{28}$$

$$e^T z^0 \ge \alpha, \tag{29}$$

$$z^0 \le M'x^0. \tag{30}$$

From (30), note that if $x_i^0 = 0$ then we must have $z_i^0 = 0$. Similar to
the proof of Theorem 1, we have that

$$e^T z^0 = (x^0)^T z^0. \tag{31}$$






Since $M'$ is a real number and every element of the matrix $C$ is nonnegative,
then for all $i$ where we have $x_i^0 = 1$, we can choose $z_i^0 \ge 0$ such
that $(Cx^0)_i = z_i^0$. Therefore, (28) and (30) are satisfied.

Multiplying (28) by $(x^0)^T$, from (31) we obtain that

$$(x^0)^T C x^0 = (x^0)^T z^0 = e^T z^0 \tag{32}$$

and as $x^0$ is an optimal solution of the problem $P_2$ then (29) is satisfied:

$$e^T z^0 = (x^0)^T C x^0 \ge \alpha \tag{33}$$

Sufficiency. The proof is similar.

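From Theorem 1, we have shown that solving the MILP in (21) gives us the
optimal solution, which is the global solution to the QIP problem in (10).
Theorem 2 shows that we can also solve the MQIP problem in (27) by solving
the following MILP formulation.

$$\min \sum_{i=1}^{n} s_i \tag{34}$$

$$\text{s.t. } \sum_{i=1}^{n} x_i - k = 0 \tag{35}$$

$$\sum_{j=1}^{n} a_{ij} x_j - s_i - y_i = 0, \quad \text{for } i = 1, \ldots, n \tag{36}$$

$$y_i - M(1 - x_i) \le 0, \quad \text{for } i = 1, \ldots, n \tag{37}$$

$$h_i - M' x_i \le 0, \quad \text{for } i = 1, \ldots, n \tag{38}$$

$$-\sum_{j=1}^{n} b_{ij} x_j + h_i \le 0, \quad \text{for } i = 1, \ldots, n \tag{39}$$

$$\sum_{i=1}^{n} h_i \ge T_\alpha\, k(k - 1) \tag{40}$$

where $x_i \in \{0, 1\}$ and $s_i, y_i, h_i \ge 0$, for $i, j = 1, \ldots, n$,
with $M = \|A\|_\infty$ and $M' = \|B\|_\infty$ as in $\bar{P}_2$.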

Applying CPLEX 7.0, this problem can be easily solved with $n = 30$. This
formulation is computationally efficient and is used to solve this quadratically
constrained quadratic zero-one problem iteratively for the selection after
every subsequent seizure. In the future, it may be useful for diagnostic purposes
to implant more electrodes. Although this will increase $n$, this formulation
remains applicable because the number of 0-1 variables grows only linearly.
Note that, in the future, more seizure characteristics may be discovered. These
would require additional quadratic and linear constraints, and the same
formulation technique would still be applicable for solving the resulting MQIP
problems.
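The way a quadratic constraint $x^T B x \ge T_\alpha k(k-1)$ is replaced by linear constraints on auxiliary variables $h_i$ can be checked on a small example. For a fixed 0-1 vector $x$, the constraints $0 \le h_i \le M' x_i$ and $h_i \le (Bx)_i$ allow $h_i$ to reach $(Bx)_i$ only where $x_i = 1$, so the largest attainable $\sum_i h_i$ equals $x^T B x$. The sketch below is an illustration under those assumptions; the names and the matrix are hypothetical, and $B$ is assumed to be nonnegative.

```python
from itertools import product


def quad(B, x):
    """Evaluate x^T B x for a 0-1 vector x."""
    n = len(x)
    return sum(B[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def max_linearized_sum(B, x):
    """Largest sum(h) with 0 <= h_i <= M' * x_i and h_i <= (Bx)_i."""
    n = len(x)
    Bx = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
    m_prime = max(sum(abs(b) for b in row) for row in B)  # infinity norm of B
    # h_i is capped at 0 when x_i = 0 and at (Bx)_i <= M' when x_i = 1.
    h = [min(Bx[i], m_prime * x[i]) for i in range(n)]
    return sum(h), h


def linearization_is_exact(B):
    """Check max sum(h) == x^T B x for every 0-1 vector x (toy sizes only)."""
    n = len(B)
    return all(max_linearized_sum(B, x)[0] == quad(B, x)
               for x in product((0, 1), repeat=n))


B = [[0, 1, 1, 9],
     [1, 0, 9, 9],
     [1, 9, 0, 9],
     [9, 9, 9, 0]]
print(linearization_is_exact(B))
```

A selection $x$ therefore admits some feasible $h$ with $\sum_i h_i \ge T_\alpha k(k-1)$ exactly when the original quadratic constraint holds, which is why only $n$ extra continuous variables are needed.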
Figure 1.12. Smoothed STLmax profiles of 5 optimal electrode sites over 150 minutes
including a seizure. The preictal period shows gradual convergence of the STLmax values
calculated for these critical electrode sites. During the seizure, STLmax values are
completely entrained. Postictally, the values are disentrained, indicating resetting, which
reverses the preictal entrainment.




Figure 1.13. Smoothed STLmax profiles of 5 non-optimal electrode sites over 150 minutes
including a seizure. Postictal resetting is not observed for these sites.

Figure 1.14. (a) Smoothed STLmax profiles of the 5 optimally selected electrodes over time
(including seizures 14, 15, and 16). The optimal electrodes were selected 10 minutes before
seizure 15. (b) Smoothed $\Omega_{\max}$ profiles from the same EEG data as in (a).

Figure 1.15. (a) Average T-index curve over time from the STLmax profiles in Figure 1.14(a).
(b) Average T-index curve over time from the $\Omega_{\max}$ profiles in Figure 1.14(b).
For illustration purposes, the smoothed (10-minute moving average) STLmax
and $\Omega_{\max}$ profiles of the five optimally selected electrodes for
seizures 14 and 15 are shown in Figure 1.14. The optimal electrodes were
selected in a 10-minute interval prior to the second seizure of each set. For
each set of seizures, STLmax and $\Omega_{\max}$ profiles clearly converge
(entrain) before the second seizure and either both or one of them diverge
(disentrain) in this seizure's postictal period. The average T-index curves that
quantify this preictal entrainment and postictal disentrainment among the
selected electrodes for each of the corresponding 3 sets of seizures are
respectively shown in Figure 1.15. The second and third sets of seizures were
included herein to show that STLmax and $\Omega_{\max}$ measures are not
identical in the detection of the entrainment and disentrainment transition
across epileptic seizures in the same patient.

Figures 1.16 and 1.17 illustrate the application of the optimization techniques
to the detection of the preictal transition preceding seizure 10 in patient 1 from
STLmax profiles. The profiles from 5 electrode sites (k=5), selected with
the optimization program applied during the 10-minute interval immediately
preceding the onset of seizure 9, are shown. Figures 1.18 and 1.19 illustrate the
detection of the preictal transition preceding seizure 14 in patient 1 from
$\Omega_{\max}$ profiles. The profiles from 5 electrode sites (k=5), selected with
the optimization program applied during the 10-minute interval immediately
preceding the onset of seizure 13, are shown. In Figure 1.16, the estimated values
forward in time from the 5 selected sites are shown for the entire interval
between seizures 9 and 10. The average T-index over all possible $T_{ij}$ indices
among the optimal 5 sites is plotted over time in Figure 1.17. In Figure 1.18,
the estimated values forward in time from the 5 selected sites are shown for the
entire interval between seizures 13 and 14. The average T-index over all possible
$T_{ij}$ indices among the optimal 5 sites is plotted over time in Figure 1.19.

Several points are noteworthy. First, the use of optimal sites for the
estimation of the dynamical measures and the average T-index profiles helps to
detect the preictal transition (about 0.5 to 1 hour before the seizure onset).
Second, the detection of the preictal transition is more robust than in the
previous illustration, where only 2 sites were used (compare Figures 1.9 and
1.10), in the sense that false warnings preceding the seizure have been
eliminated. The need for optimization in selecting cortical sites that are likely
to show the preictal transition prior to an impending seizure is clear when we
compare the results derived with optimization (e.g. in Figures 1.18 and 1.19) to
sites selected without the use of optimization (e.g. in Figures 1.20 and 1.21).
In Figure 1.20, the profiles from 5 randomly selected electrode sites (k=5) from
the same 10-minute interval prior to seizure 13 are estimated for the same
interval between seizures 13 and 14. The corresponding average T-index profile is
shown in Figure 1.21. It is clear that the preictal transition of seizure 14
cannot be reliably detected by analyzing the EEG from sites that were not
selected using the optimization program.
Figure 1.16. Convergence of 5 STLmax profiles from critical cortical sites over 2 hours
between seizures 9 and 10 (patient 1). The ictal periods of the two seizures are denoted by
vertical lines.

Figure 1.17. The T-index profile among 5 critical cortical sites whose STLmax profiles are
depicted in Figure 1.16. The cortical sites are dynamically entrained approximately 60 minutes
prior to the onset of seizure 10. The statistical thresholds are represented by the two
horizontal lines.

Figure 1.18. Angular frequency $\Omega_{\max}$ profiles between seizures 13 and 14 (patient 1)
of 5 electrode sites selected by the optimization program during the 10-minute interval prior
to the onset of seizure 13.

Figure 1.19. The average T-index profile of the 5 optimal electrode sites whose
$\Omega_{\max}$ profiles are depicted in Figure 1.18. The 5 sites become and remain
dynamically entrained approximately 0.5 hour prior to the onset of seizure 14.

Figure 1.20. Angular frequency $\Omega_{\max}$ profiles between seizures 13 and 14 (patient 1)
of 5 non-optimally selected electrode sites.

Figure 1.21. The average T-index profile of the 5 non-optimal electrode sites whose
$\Omega_{\max}$ profiles are depicted in Figure 1.20. The 5 sites do not become dynamically
entrained between seizures 13 and 14.
4.4. Results

Having defined that electrode sites that participate in the preictal transition
must be entrained prior to seizures, we hypothesize that “the electrode sites that
are most entrained during the current seizure and disentrained after the seizure
onset should be more likely to be entrained prior to the next seizure than the
other entrained sites.” Critical electrode sites are those sites that are most
entrained prior to seizures and disentrained after the seizure onset. As a result,
it is possible to predict a seizure if one can identify critical electrode sites in
advance. To test this hypothesis, we designed an experiment that compares
the probability of detecting the preictal transition from any entrained cortical
sites with the probability of detecting the preictal transition from the critical
cortical sites. In this experiment, testing on 3 patients with 20 seizures, we
randomly selected 5,000 groups of entrained sites and employed the computational
approach from the previous section to solve the multi-quadratic problem (select
the critical sites).

The results show that the probability of detecting the preictal transition from
the critical cortical sites is approximately 83%. When we compare this probability
with the probability of detecting the preictal transition from any entrained sites,
we obtain a P-value < 0.07, which is significant and validates our hypothesis.
The histogram of the probability of detecting the preictal transition from randomly
selected entrained cortical sites, compared with the probability of detecting
the preictal transition from the critical cortical sites, is illustrated in
Figure 1.22.

5. Conclusions and Discussion 

In this paper, we are interested in the multi-quadratic 0-1 programming problem,
which is one of the most practical optimization problems. We proposed a new
computational approach to solve the multi-quadratic 0-1 programming problem.
In this approach, we have developed a novel linearization technique based on
the Karush-Kuhn-Tucker (KKT) optimality conditions. It is well known that the
KKT optimality conditions guarantee global optimality only in the convex
case. Although the developed technique seems to be heuristic in nature, we have
proven that it guarantees global optimality under the positivity assumption on
the elements of the quadratic matrices. It is worth noting that because of the
properties of T-index matrices, the positivity assumption always holds.

While this developed technique solves the multi-quadratic 0-1 programming
problems with global optimality, it linearizes the problem with the same number
of 0-1 variables ($n$) and an additional $O(n)$ number of continuous variables.
On the other hand, the conventional linearization techniques found in the
literature linearize the problem with an additional $O(n^2)$ number of 0-1
variables. This makes the problem much larger and harder to solve. The comparison
of computational times between the conventional linearization approach (found
in the literature) and the KKT conditions linearization approach has shown that
the developed technique outperforms the conventional approach; that is, the new
technique solves problems faster than the conventional one and consumes
considerably fewer computational resources.

Figure 1.22. Histogram of probability of detecting preictal transition of randomly selected
entrained electrode sites 5,000 times compared with the most entrained electrode sites.

The results of this study in epilepsy confirm our hypothesis that the set of most
converged cortical sites during the current seizure and reset after seizure onset is
more likely to be converged again during the next seizure than other converged
cortical sites. These results indicate that it may be possible to develop
automated seizure warning devices for diagnostic and therapeutic purposes. Thus,
it is possible to predict an impending seizure based on optimization and nonlinear
dynamics of multichannel intracranial EEG recordings. Prediction is possible
because, for the vast majority of seizures, the spatiotemporal dynamical
features of the preictal transition are sufficiently similar to those of the
preceding seizure. This similarity makes it possible to identify electrode sites
that will participate in the next preictal transition by solving the
multi-quadratic 0-1 problem. Although evidence for the characteristic preictal
transition utilized by the seizure prediction algorithm employed in this study
was first reported by our group in 1991 [11], further studies were required
before a practical seizure prediction algorithm was feasible. Development of a
seizure prediction algorithm was complicated because the cortical sites
participating in the preictal transition varied from seizure to seizure. This
problem was overcome by the use of our proposed approaches to solve the
multi-quadratic 0-1 problem. Because the algorithm selects candidate electrode
sites by analyzing continuous EEG recordings of several days' duration, the
computational approach to solving the optimization problem has to be very
efficient. At present, we can efficiently solve the multi-quadratic 0-1 problem.
However, future technology may allow physicians to implant thousands of electrode
sites, $n > 1000$, in the brain. This procedure will extract more information and
allow us to have a better understanding of the brain. Therefore, to solve this
optimization problem with $n > 1000$, we may need computationally fast heuristic
approaches in the future.
Abstract Epileptic seizures result from the intermittent spatial and temporal summation of
abnormally discharging neurons [3,7,20,24]. Complex dynamical interactions
between brain regions act to recruit and entrain neurons by loss of inhibition and 
synchronization. Long-term continuous 4-channel intracranial electroencephalo- 
graphic (EEG) recordings were obtained from a genetically engineered model of 
generalized epilepsy (n = 3) and littermate controls (n = 3) in order to perform 
nonlinear dynamical analyses of intracranial brain electrical activity. Signal pro- 
cessing techniques included reconstruction of the EEG signal as trajectories in 
a phase space, applying nonlinear indicators on the trajectories (i.e., short-term 
maximum Lyapunov exponent), and statistical index (T-index) for quantifying 
interactions among distant brain sites. Analysis of interictal (seizure-free) and 
seizure prone periods in 2 to 3 week old H218 epileptic knockout mice revealed 
that (1) the brain electrical activity is of higher order during the seizure-prone 
period, and (2) the interaction among brain sites is more active during the seizure- 
prone period. In addition, dynamical analyses do not show significant difference 
between interictal periods in epileptic mice and littermate controls. These re- 
sults suggest that the development of seizures in an animal model of generalized 
epilepsy is determined in part by non-stochastic neural processes. Results further 
suggest that it may be possible to identify the occurrence of seizures in advance 
through dynamical analytic distinction of interictal and seizure-prone periods. 

Keywords: Epilepsy, nonlinear dynamics, Lyapunov exponents, T-index, animal models



1. Introduction 

Seizure onset results from the spatial and temporal summation of abnormally 
discharging neurons, which act to recruit and entrain other neurons by loss of
inhibition and synchronization [3,7,20,24]. Thus, preictal changes should be
detectable during the period of neuronal recruitment and entrainment. Traditional
linear signal analyses, such as frequency coherence [9], focal spike density
counts [10], or spectral analyses are not reliable indicators for seizure
prediction. Nonlinear indicators have been shown to undergo predictable changes in
advance of seizure onset [8,11-15,18,19]. These studies suggest that seizures
represent the spontaneous formation of self-organizing spatial and temporal 
patterns of brain activity. The interictal to ictal state transition denotes a grad- 
ual phase transition from a complex to a less complex (more ordered) state 
during the preictal phase. This preictal transition was detected in human par- 
tial epilepsy using techniques developed for the study of complex non-linear 
systems [1]. 
We undertook the present study to determine whether nonlinear analyses of
continuous EEG could distinguish between interictal and preictal states in an
experimental animal model of generalized epilepsy in which the entire protein
coding region of the single copy mouse H218 gene was deleted through
homologous recombination [17]. Southern blot analysis with 3' and 5' probes, as
well as PCR analysis, confirmed the appropriate location of the mutation in both
ES cells and mice following germ line transmission. The appearance and
behavior of the newborn H218-/- mice were indistinguishable from those of their
H218+/+ littermates. The loss of H218 had no effect on the weight or length of
the mice throughout development. Necropsy of adult H218+/+ and H218-/-
mice did not reveal any differences in general anatomy or in the position, size,
or appearance of major organs. Likewise, examination of cresyl violet-stained
sections in embryonic, newborn, and postnatal day (P) 18-35 mice brains did not
detect any genotype-dependent differences. In particular, the H218-/- mice
did not display any gross abnormalities reflecting potential defects in neuronal
proliferation, migration, differentiation or survival. Recurrent and unprovoked
spike-and-wave discharges (SWDs) occurred beginning at the end of the
second postnatal week [17]. SWDs were often accompanied by stereotypical ictal
behavioral changes, including frozen staring, vibrissae twitching, facial
myoclonus, inability to move, and wild running fits [17]. Electroencephalographic
(EEG) seizures, ictal behavioral manifestations, and timing of seizure onset and
termination varied little between siblings, litters, and generations. These
intrinsic characteristics of H218 were deemed highly suitable for conducting our
dynamical nonlinear analyses of cortical neuronal activity.

Spatiotemporal information of brain activity was obtained from multi-electro- 
de, continuous, 24-hour recordings in PI 8-25 H218 mice and age-matched lit- 
termate controls. Our objective was to follow the transition toward epileptic 
seizures in H218 mice by reconstructing EEG recordings as trajectories in a 
phase space, applying non-linear indicators on such trajectories, and utilizing 
statistical index to quantify the interactions among different brain sites. An 
important concept when studying the dynamics of a system using nonlinear 
analyses is the reconstruction of the phase or state space. The phase space 
of a dynamical system is a mathematical space with orthogonal coordinate di- 
rections representing each of the variables needed to specify the instantaneous 
state of the system [4]. The phase space reconstruction method builds vectors
from the data by iterating a time delay [1]. Here, we use 7 sequential voltages,
separated by a 20-millisecond delay, to generate a point in 7-dimensional phase
space. The process is repeated to generate all possible points within each
10.24-second epoch at a sampling frequency of 200 Hz. These points form an
attractor. The Lyapunov exponent is a numerical indicator that describes the
average rate at which the trajectories of adjacent states in the phase space
diverge or converge over time [1].

Figure 2.1. Depth electrode placement diagram. Microelectrodes are placed in the right and
left frontal cortex and right and left hippocampus in order to record continuous brain electrical 
activity. 
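
The phase space reconstruction described before Figure 2.1 (7 sequential voltages, 20 ms apart, within a 10.24-second epoch sampled at 200 Hz) can be illustrated with a minimal sketch. The function name `delay_embed` and the random placeholder epoch are assumptions made for illustration only; they are not the authors' implementation or data.

```python
import numpy as np

def delay_embed(signal, dim=7, delay_samples=4):
    """Return the delay vectors of `signal` as rows of an (n_points, dim) array."""
    signal = np.asarray(signal, dtype=float)
    n_points = signal.size - (dim - 1) * delay_samples
    if n_points <= 0:
        raise ValueError("signal too short for this embedding")
    # Column j holds the signal shifted by j * delay_samples samples.
    return np.column_stack(
        [signal[j * delay_samples: j * delay_samples + n_points] for j in range(dim)]
    )

FS_HZ = 200                              # sampling frequency given in the text
EPOCH_SAMPLES = int(round(10.24 * FS_HZ))  # 2048 samples per epoch
DELAY_SAMPLES = int(round(0.020 * FS_HZ))  # 20 ms delay = 4 samples

# Placeholder epoch (random numbers standing in for one EEG channel).
rng = np.random.default_rng(0)
epoch = rng.standard_normal(EPOCH_SAMPLES)

points = delay_embed(epoch, dim=7, delay_samples=DELAY_SAMPLES)
```

Under these assumptions each epoch yields 2048 - 6 x 4 = 2024 points in the 7-dimensional phase space.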



Previous studies in epileptic patients revealed that the preictal state is
characterized by a gradual convergence of the short-term maximum Lyapunov
exponent (STLmax) values recorded from critical electrode sites to a common
value [19]. This convergence has been defined as dynamical entrainment [19].
The degree of significance of the dynamical entrainment can be quantified by
a statistical T-index derived from the standard paired T-test. In the present
study, we use the nonlinear dynamical measure STLmax and the statistical T-index
to investigate the neurodynamical differences between interictal and preictal
states in an epileptic mouse model with generalized seizures.

2. Materials and Methods 
2.1. The H218 model 

Untimed pregnant mice were housed at the animal facility at the University of
Florida (Gainesville, FL). The suckling mice were weaned at P17 and then housed
in groups of three animals of the same gender until the day of surgery for
electrode implantation. All animals were maintained in a controlled environment
on a 12-hr light, 12-hr dark cycle with lights on at 0600 hr. They were given
ad lib access to food and water.

2.2. Surgery and video-electrocorticography

Chronic intracranial microelectrode implantation in H218 (n = 3) and wild-type
(n = 3) littermates allowed for continuous electroencephalographic recordings
with simultaneous video monitoring [17]. P17 H218 mice (average body weight for
males and females was 19-25 g) were anesthetized with avertin (1.25%
tribromoethanol/amyl alcohol solution, 0.02 ml/gm, i.p.), which lasted for
0.45-1 hr, and placed in a Kopf stereotactic frame. The scalp was split and all
soft tissue loosened from the dorsum of the skull. Microelectrodes (0.125 mm
diameter, platinum; Plastics One, Inc., Roanoke, VA) were chronically implanted
1 mm deep into the right and left dorsal hippocampus and right and left frontal
cortex (see Fig. 2.1). The points of bregma, lambda, and the auditory meatus
were used for reference. A correction factor for the stereotaxic coordinates for
mouse brain was calculated according to the regression equation
F' = Fa - 0.66(bl - 3.8), where F' is the predicted frontal coordinate of a
particular neural point, bl is bregma, and Fa is the frontal coordinate of that
point given by a mouse brain atlas [21]. Additional reference and ground
electrodes were also used. All electrodes (intracranial, reference and ground)
were connected to a molded plastic pedestal (Plastics One, Inc., Roanoke, VA),
which was secured to the skull by cranioplastic dental cement. H218-/- and
wild-type controls were operated on, recorded and evaluated in parallel. Animals
were allowed to recover from anesthesia for 24 hr before returning to a standard
mouse cage.
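
As a small worked illustration of the regression equation F' = Fa - 0.66(bl - 3.8), the sketch below evaluates it for made-up inputs. The function name and the numerical values are hypothetical, and the units are whatever the atlas uses; the text does not specify them.

```python
def corrected_frontal_coordinate(fa, bl):
    """Evaluate F' = Fa - 0.66 * (bl - 3.8) as written in the text."""
    return fa - 0.66 * (bl - 3.8)

# Hypothetical inputs, chosen only to show the direction of the correction.
no_correction = corrected_frontal_coordinate(fa=2.0, bl=3.8)   # bl equals 3.8
shifted = corrected_frontal_coordinate(fa=2.0, bl=4.8)         # bl larger by 1.0
```

When bl equals the reference value 3.8 the atlas coordinate is returned unchanged; each unit of bl above 3.8 lowers the predicted coordinate by 0.66.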

On the day of recordings (P18-25), the mice were freely moving in individual,
warm, Plexiglas chambers (Dragonfly, WA). All animals were maintained
in a controlled environment on a 12-hr light, 12-hr dark cycle with lights on at
0600 hr in order to minimize circadian variations. They were given ad lib
access to food and water. A 20-min adaptation period before electrocorticography
recordings minimized movement artifact. Four-channel monopolar EEG
recordings were made on a 32-channel digital recorder (BSMI/Nicolet 4000,
Madison, WI, U.S.A.). Referential, unilateral left and right, bipolar and linked
monopolar recordings from ipsilateral regions were compared to determine the
phase relation, amplitudes, and source of the spike-and-wave discharges [17].

EEGs were recorded on-line (digitally) onto high fidelity videotape. Prior
to storage on the magnetic medium, signals were sampled at 200 Hz, using
an analog-to-digital (A/D) converter with 12-bit quantization, and amplifiers
with an input range of -2.5 to +2.5 mV and a frequency range of 0.05 Hz to 70 Hz.
Subsequently (off-line), the data from the magnetic tapes were transferred to the
hard disk (maximum capacity 9 Gigabyte), and finally to optical disks (capacity
1.5 Gigabyte). Computer data files were created that included rodent
identification, time and date of each seizure, the classification of each
seizure and the original EEG data. Mice were able to move freely around a cage
containing food and water during the EEG recording sessions. During recordings,
animals were observed directly by one or two investigators and suspected
behavioral seizure activity was recorded in a separate log. EEG recordings were
compared with concomitant animal behavior by using a split screen monitor
display of the video-EEG record. Video records were reviewed daily for evidence
of SWDs and behavioral seizures.

All animal procedures were reviewed and approved by the University of 
Florida Institutional Animal Care and Use Committee. 

2.3. Seizures in H218 mice 

P18-25 H218 mice manifested behavioral seizures that consisted of frozen
staring, vibrissae twitching, facial myoclonus, inability to move, and, less
frequently, wild-running/bouncing fits. In the present study, 168 behavioral
seizures were time-locked with SWDs in P18-25 H218 mice (n = 3). The most
severe seizures were succeeded by wild-running/bouncing fits followed by
behavioral and EEG depression. Eight running/bouncing seizures were identified
in H218 mice between P18-25 (4.2 running/bouncing fits/H218 mouse).

EEGs of H218 mice had relatively normal backgrounds during maximal alertness.
During periods of behavioral quiescence, generalized sharp waves were
often seen. Obvious seizure behavior (i.e., frozen staring, vibrissae twitching,
facial myoclonus, inability to move, and less frequently, wild-running/bouncing
fits) was always accompanied by SWDs. In some instances, SWDs were observed
in the absence of overt ictal behavior. Seizures often occurred in clusters
over a 0.5-hr period. Individual seizure duration varied from 7 to 20 sec.
Average seizure duration was 12 sec. A circadian pattern to seizure occurrence
was not observed in this limited cohort.

Electrographic traces of seizure episodes often began with the synchronous
buildup of 4-5 Hz, 30-60 µV amplitude SWDs. During the behavioral ictus,
bilateral 5-6 Hz, 60-100 µV SWDs were noted. An example EEG trace recorded
during a motor seizure from one H218 animal is shown in Fig. 2.2a,b,c. During
behavioral sleep, individual spikes and runs of 2 to 3 seconds of bilateral
spiking were seen. SWDs usually disappeared on arousal. Wild-type mice did not
exhibit overt seizures, though occasional behavioral arrests were seen in 2 mice.
No EEG abnormalities were identified during wakefulness and sleep in wild-type
littermates (see Fig. 2.2d).

2.4. Short-term maximum Lyapunov exponents 

We used an estimate of the short-term maximum Lyapunov exponent
(STLmax) as the dynamical measure of the electroencephalogram. STLmax was
estimated by dividing the EEG signal into non-overlapping segments of 10.24 sec
each. The largest Lyapunov exponent (Lmax or L1) is defined as the average of
the local Lyapunov exponents Lij in the state space, that is:

$$ L_{\max} = \frac{1}{N_a} \sum_{i,j} L_{ij}, \tag{1} $$

where Na is the total number of the local Lyapunov exponents that are estimated
from the evolution of adjacent points (vectors) in the state space, Xi = X(ti),
Xj = X(tj), and

$$ L_{ij} = \frac{1}{\Delta t} \log_2 \frac{|\delta X_{ij}(\Delta t)|}{|\delta X_{ij}(0)|}, \tag{2} $$

where Δt is the evolution time allowed for the vector difference |δXij(0)| to
evolve to the new difference |δXij(Δt)|, where

$$ \delta X_{ij}(0) = X(t_i) - X(t_j), \qquad \delta X_{ij}(\Delta t) = X(t_i + \Delta t) - X(t_j + \Delta t). $$

If Δt is given in sec, then Lmax will be in bits/sec.

Figure 2.2. Electroencephalographic (EEG) recordings: (a) Intermittent, high amplitude,
asynchronous and asymmetric, bilateral, polyspike-wave discharges in a postnatal day 21 H218-/-
mouse during a myoclonic seizure. (b) The first 20 sec of the trace in (a), shown at a slower
time scale to better illustrate the polyspike spike/wave characteristics of the discharges. (c)
Continuous, high amplitude, bilateral, polyspike-wave discharges during a wild running episode
(overlying line) by an H218-/- mouse. (d) Awake H218+/+ control mouse recording obtained
using procedures identical to those of (a)-(c). Data shown are all bipolar recordings from the
right (R) and left (L) frontal cortex. Similar, but less robust, results were obtained with bilateral
hippocampal electrodes. Scale bar: (a), 4 sec, 30 mV; (b)-(d), 1 sec, 35 mV. (From [17]).
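
The averaging of local exponents in this section, L_ij = (1/Δt) log2(|δX_ij(Δt)| / |δX_ij(0)|) averaged over N_a pairs, can be illustrated with a simplified sketch. It is not the STLmax algorithm itself: the text does not describe how adjacent points or the evolution time are selected, so the pairs are simply passed in by the caller. The function name and the toy trajectory are assumptions for illustration.

```python
import numpy as np

def mean_local_exponent(points, pairs, evolve_steps, dt_sec):
    """Average L_ij = (1/dt) * log2(|dX_ij(dt)| / |dX_ij(0)|) over the given pairs."""
    points = np.asarray(points, dtype=float)
    delta_t = evolve_steps * dt_sec          # evolution time in seconds
    local = []
    for i, j in pairs:
        d0 = np.linalg.norm(points[i] - points[j])
        d1 = np.linalg.norm(points[i + evolve_steps] - points[j + evolve_steps])
        if d0 > 0 and d1 > 0:                # skip degenerate pairs
            local.append(np.log2(d1 / d0) / delta_t)
    return float(np.mean(local))

# Toy 1-D trajectory whose separations double at every sample (hypothetical data).
toy = (2.0 ** np.arange(12)).reshape(-1, 1)
pairs = [(0, 1), (2, 5), (3, 4)]             # hand-picked pairs, not nearest neighbors
l_max = mean_local_exponent(toy, pairs, evolve_steps=3, dt_sec=1 / 200)
```

Because every separation in the toy trajectory doubles once per sample, the estimate is one bit per sample, i.e. about 200 bits/sec at a 200 Hz sampling rate.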
2.5. Statistical T-index

We used the T-index (from the statistical paired T-test) to measure the degree
of entrainment between critical electrode sites. The T-index of a pair of
electrodes was calculated in each 5-min epoch (30 STLmax segments) by dividing
the mean difference of STLmax between the two electrode sites by its
standard deviation and scaling by the square root of the window length. That is,
the T-index at time t between electrode sites i and j is defined as:

$$ T_{i,j}(t) = \sqrt{N} \times \frac{\left| E\{ STL_{\max,i} - STL_{\max,j} \} \right|}{\sigma_{i,j}(t)}, \tag{3} $$

where E{·} is the sample average of the differences STLmax,i - STLmax,j
estimated over a moving window wt(λ) of length N, and σi,j(t) is the sample
standard deviation of the STLmax differences between electrode sites i and j
within the moving window wt(λ).

2.6. Statistical Analyses

STLmax and T-index were calculated over time in the 4-channel EEG recordings
(left and right frontal cortex and left and right hippocampus) from three
epileptic mice and three littermate controls. One interictal period (at least 8
hours away from a seizure) and one seizure-prone period (at least 3 seizures)
from each epileptic mouse were analyzed. In one epileptic mouse, 6 hours of
continuous EEG recording during an interictal period and 2.7 hours of continuous
EEG recording during a seizure-prone period with 7 severe seizures were
analyzed. In a second epileptic mouse, 2.9 hours of continuous EEG recording
during an interictal period and 5.4 hours of continuous EEG recording in a
seizure-prone period with 7 severe seizures were analyzed. In a third epileptic
mouse, 3.6 hours of continuous EEG recording during an interictal period and
6 hours of continuous EEG recording during a seizure-prone period with 7
severe seizures were analyzed. These data were compared with those from
age-matched littermate controls with 6 hours of continuous EEG recordings.

The difference of STLmax values between interictal and preictal periods
in epileptic mice was evaluated by two-way ANOVA with replicates.
Thirty random samples of STLmax values in each period were used in the test.
The same analysis was also performed to compare the dynamical difference
between interictal periods in H218 epileptic mice and littermate controls.
Further, we compared the degree of dynamical entrainment between interictal and
preictal states in H218 mice by their average T-index curves. We first defined
significant entrainment (SE) as a T-index value less than a critical value from
the T-distribution with significance level = 0.05. The probabilities of significant
entrainment for all periods were then estimated and compared by calculating 
the proportions of observed SE. 
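
The T-index of Section 2.5 and the proportion of significant entrainment defined above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the default window of 30 STLmax values follows the 5-min epoch stated in the text, and the critical value is assumed to be supplied by the caller (for example from a T-distribution table).

```python
import numpy as np

def t_index(stl_i, stl_j):
    """T = sqrt(N) * |mean(d)| / std(d), with d = STLmax_i - STLmax_j in one window."""
    d = np.asarray(stl_i, dtype=float) - np.asarray(stl_j, dtype=float)
    return float(np.sqrt(d.size) * abs(d.mean()) / d.std(ddof=1))  # sample std

def sliding_t_index(stl_i, stl_j, window=30):
    """T-index for every run of `window` consecutive STLmax values."""
    n = len(stl_i) - window + 1
    return np.array(
        [t_index(stl_i[k:k + window], stl_j[k:k + window]) for k in range(n)]
    )

def proportion_entrained(t_values, critical_value):
    """Fraction of windows whose T-index is below the critical value."""
    return float(np.mean(np.asarray(t_values) < critical_value))

# Tiny hand-checkable example: differences 1, 2, 3 have mean 2 and sample std 1.
example_t = t_index([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
```

For the three-value example the result is sqrt(3) x 2 / 1, roughly 3.46; low T-index values correspond to small, consistent STLmax differences between the two sites.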

3. Results 

Figures 2.3 and 2.4 show the STLmax 5-point smoothed curves of 4 elec- 
trodes during the interictal and preictal periods in an epileptic mouse. Figure 
2.5 shows the STLmax 5-point smoothed curves of 4 electrodes for a littermate 
control. The results of the test between interictal and preictal periods in three 
epileptic mice showed that, for all electrodes, the STLmax values are signif- 
icantly lower during the preictal periods (p < 0.01). In the comparisons of 
the interictal periods with the littermate controls, the test showed no significant 
difference in frontal cortex with respect to their STLmax values (p > 0.05). 
However, the STLmax values from the hippocampus are significantly lower in 
controls (p < 0.01). 

Figure 2.6 shows the average T-index curves during the interictal and
seizure-prone periods from the same epileptic mouse as in Figures 2.3 and 2.4,
and from the same littermate control as in Figure 2.5. From these three T-index
curves, it is apparent that the degree of entrainment is larger during the
seizure-prone period. The combined paired comparison over three epileptic mice
showed that the probabilities of significant entrainment (SE) during the
seizure-prone periods are approximately 12% higher than during the interictal
periods. Further, the probability of SE in the littermate control is only
approximately 3% lower than that during the interictal period in epileptic mice.
However, comparisons from more mice are required in order to establish
statistical significance.

4. Discussion 

The present findings indicate that it is possible to anticipate a
seizure-vulnerable period in a genetically engineered animal model of
generalized epilepsy. These results are based on the dynamical nonlinear
time-series analyses of continuous brain electrical activity collected over
several days in H218 mice. By implication, the results suggest that the
development and resolution of seizures are determined by non-stochastic
processes and that it is possible to anticipate the occurrence of seizures in
advance. Similar results have been reported in human temporal lobe epilepsy
[8, 14, 15, 18]. The analysis of the spatial and temporal pattern dynamics of
long-term intracranial EEGs, recorded for clinical purposes in patients with
medically intractable temporal lobe epilepsy, has demonstrated that seizures are
preceded by dynamical changes that are detectable within 30 minutes to 1 hour
before the seizure onset [11, 12, 14]. These investigations utilize the STLmax
to quantify the rate of convergence or divergence of neighborhood trajectories
in an attractor. The preictal transition in a sample of temporal lobe seizures
was characterized by a gradual convergence of STLmax values recorded from
critical electrode sites to a common value.

Figure 2.3. Five-point smoothed STLmax profiles over 2.5 hours for right frontal, left frontal,
right hippocampus, and left hippocampus in an H218-/- mouse during the interictal (between
seizure) period. The dashed horizontal lines indicate the mean STLmax values over the period.

Figure 2.4. Five-point smoothed STLmax profiles over 5.4 hours for right frontal, left frontal,
right hippocampus, and left hippocampus in an H218 mouse during the seizure-prone period.
The dashed horizontal lines indicate the mean STLmax values over the period.

Figure 2.5. Five-point smoothed STLmax profiles over 6 hours for right frontal, left frontal,
right hippocampus, and left hippocampus in a littermate control mouse. The dashed horizontal
lines indicate the mean STLmax values over the period.

Figure 2.6. Representative example of a T-index curve among 4 electrode sites over time from
interictal and seizure-prone periods in an H218 mouse and an H218 littermate control. The dashed
horizontal line indicates the critical value of T-index.

What may account for the observed spatiotemporal dynamical changes that
occur before seizure onset? One possibility is that seizures are preceded by
physiological changes that are reflected in the dynamical characteristics of the
EEG signals. The short-term Lyapunov exponent is a direct measure of the
degree of order within the signal. Our findings suggest that brain dynamics are
less ordered in H218 epileptic mice as compared to age-matched controls, and
that the brain electrical activity becomes increasingly more ordered
immediately before the seizure. For instance, STLmax and T-index values are lower
in H218 mice as compared to age-matched controls. These results suggest that
the neurodynamics of H218 epileptic mice are potentially more conducive to
seizures. Seizure onset, which results from the spatial and temporal summation
of abnormally discharging neurons, acts to recruit and entrain other neurons by
loss of inhibition and synchronization [3, 7, 20, 24]. Our quantitative results
support the hypothesis that the nonlinear dynamical changes are likely to
represent the preictal period of neuronal recruitment and entrainment.

What are the implications of these observations? These results suggest that
nonlinear quantitative analysis of broad regions of brain structures may be a
more sensitive method for detecting alterations in the behavior of the network
before the more traditional seizure discharge is seen, and may allow prediction
of seizures before they are expressed behaviorally or on traditional EEG. These
results are also consistent with the suggestion that epilepsy is a disorder of
large neural networks and that electrical hyperexcitability associated with
seizure activity reverberates within the neural structures of the network, which
operate together and inextricably to culminate in the eventual expression of a
seizure by entrainment of this large neural network from any given part of the
brain [22]. Interestingly, our results demonstrate that in the generalized-onset
seizure model, as in human partial-onset seizures [11, 12, 14], seizures could
be anticipated by several minutes by automated nonlinear analysis of the
dynamical characteristics of intracranial EEG recordings. The similarities may
reflect the existence of specific cortical and subcortical networks in the
genesis and expression of partial and generalized seizures [2, 5, 6, 16, 23],
which in turn are expressed in the quantitative nonlinear values of the EEG. The
present study represents a proof-of-concept preclinical effort, which is a
necessary initial step in the development of a seizure prediction device for
humans. A seizure warning system could be incorporated into a digital signal
processing chip for use in implantable devices. Such devices could be utilized
to activate pharmacological or physiological therapeutic interventions designed
to prevent an impending seizure. Future animal studies, employing novel
experimental designs and sensitivity and specificity studies, will be required
to investigate the therapeutic potential for implantable seizure warning
devices. The results also indicate that these methods may serve to differentiate
seizure vulnerability in susceptible subjects well in advance of the onset of
epilepsy.

Abstract  We present a survey and generalizations of some ICA methods, such as
          maximization of kurtosis and algebraic cumulant methods that combine
          second and fourth order statistics by joint diagonalization of
          covariance and cumulant matrices depending on time delays. We describe
          an experiment with EEG data showing that the combination of second and
          fourth order statistics gives better results for detecting eye
          movements.

Keywords: Independent Component Analysis, fixed point algorithm, cumulant
          methods, eye movement.



1. Introduction 

The problem of Independent Component Analysis (ICA) and blind source
separation (BSS) has received wide attention in various fields such as
biomedical signal analysis and processing (EEG, MEG, fMRI), speech enhancement,
geophysical data processing, data mining, wireless communications, image
processing, etc. The literature on the ICA and BSS problem is huge (see for
instance [6], [12] and references therein).

The problem is formulated as follows: we can observe sensor signals (random
variables) $\mathbf{x}(k) = [x_1(k), \dots, x_n(k)]^T$ which are described as

$$ \mathbf{x}(k) = \mathbf{A}\,\mathbf{s}(k), \qquad k = 1, 2, \dots \tag{1} $$

where $\mathbf{s}(k) = [s_1(k), \dots, s_n(k)]^T$ is a vector of unknown source
signals and $\mathbf{A}$ is an $n \times n$ non-singular unknown mixing matrix.

Our objective is to estimate the source signals sequentially one-by-one or
simultaneously assuming that they are statistically independent.

In this article we present a survey and generalization of some methods of
ICA, such as maximization of kurtosis and some algebraic cumulant methods, and
an experiment with EEG data confirming that the combination of second and fourth
order statistics gives better results in the task of extracting or estimating
components related to eye movements. More precisely, we give a mathematical
explanation of why maximization of the absolute value of the kurtosis
gives independent components and present a generalization of the algorithm of
Hyvarinen and Oja [13] for high order cumulants, which has a high order
convergence rate. We also present a novel ICA algorithm combining second and
fourth order statistics, based on the approximate joint diagonalization method
(see [3]) for covariance and cumulant matrices with time delays, and apply it
to the detection of eye movements.

2. Extraction via maximization of the absolute value of
the cumulants

Maximization of nongaussianity is one of the basic ICA estimation principles
(see [6], [12]). This principle is explained by the central limit theorem,
according to which sums of nongaussian random variables are closer to gaussian
than the original ones. Therefore, a linear combination
$y = \mathbf{w}^T\mathbf{x} = \sum_i w_i x_i$ of the observed mixture variables
(which is a linear combination of the independent components as well, because of
the linear mixing model) will be maximally nongaussian if it equals one of the
independent components. Below we give a rigorous mathematical proof of this
statement. Finding such a vector $\mathbf{w}$, which gives one independent
component and therefore should be one (scaled) row of the inverse of the mixing
matrix $\mathbf{A}$, is the main task of (sequential) ICA. We will describe an
optimization problem for this task.

Define the function $\varphi_p : \mathbb{R}^n \to \mathbb{R}$ by

$$ \varphi_p(\mathbf{w}) = \operatorname{cum}_p(\mathbf{w}^T\mathbf{x}), $$

where $\operatorname{cum}_p$ means the self-cumulant of order $p$:

$$ \operatorname{cum}_p(s) = \operatorname{cum}(\underbrace{s, \dots, s}_{p}) $$

(see [16] for the definition and properties of the cumulants). The following
property of the cumulants is used essentially in the derivation of the fixed
point algorithm [13] and its generalization below: if $s_i$, $i = 1, \dots, n$,
are statistically independent, then

$$ \operatorname{cum}_p\!\left( \sum_{i=1}^{n} c_i s_i \right) = \sum_{i=1}^{n} c_i^p \operatorname{cum}_p(s_i). \tag{2} $$

Consider the maximization problems

OP(p)   maximize $|\varphi_p(\mathbf{w})|$
        under constraint $\|\mathbf{w}\| = 1$

and

DP(p)   maximize $|\psi_p(\mathbf{c})|$
        under constraint $\|\mathbf{c}\| = 1$,

where $\psi_p(\mathbf{c}) = \operatorname{cum}_p\!\left( \sum_{i=1}^{n} c_i s_i \right)$.

Denoting $y = \mathbf{w}^T\mathbf{x}$, $\mathbf{c} = \mathbf{A}^T\mathbf{w}$ we have

$$ y = \mathbf{c}^T\mathbf{s} = \sum_{i=1}^{n} c_i s_i $$

and

$$ \varphi_p(\mathbf{w}) = \psi_p(\mathbf{A}^T\mathbf{w}). \tag{3} $$

Without loss of generality we may assume that the matrix $\mathbf{A}$ is
orthogonal (assuming that we have applied the well-known preprocessing called
"prewhitening", see [6] or [12]).
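
To make the prewhitening assumption concrete, the following sketch mixes independent sources according to model (1) and whitens the observations with the inverse square root of their sample covariance. The mixing matrix, the uniform sources and the function name `prewhiten` are hypothetical choices for illustration; the references [6] and [12] describe the procedure itself.

```python
import numpy as np

def prewhiten(x):
    """Center x (n_channels, n_samples); return (z, V) with z = V @ x_centered, cov(z) = I."""
    xc = x - x.mean(axis=1, keepdims=True)
    cov = xc @ xc.T / xc.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)                      # cov is symmetric
    whitener = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T    # cov^(-1/2)
    return whitener @ xc, whitener

rng = np.random.default_rng(0)
# Independent, zero-mean, unit-variance sources (uniform on [-sqrt(3), sqrt(3)]).
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(3, 5000))
# Hypothetical non-singular mixing matrix.
a = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.1, 0.6, 1.0]])
x = a @ s                          # model (1): x(k) = A s(k)

z, v = prewhiten(x)
effective_mixing = v @ a           # z is (approximately) effective_mixing @ s
```

After whitening, the sample covariance of `z` is the identity, and `effective_mixing` is orthogonal up to sampling error, which is the situation assumed in the rest of this section.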

It is easy to see (using (3) and the orthogonality of $\mathbf{A}$) that the
problems DP(p) and OP(p) are equivalent in the sense that $\mathbf{w}^*$ is a
solution of OP(p) if and only if $\mathbf{c}^* = \mathbf{A}^T\mathbf{w}^*$ is a
solution of DP(p).

A very useful observation is the following: if a vector $\mathbf{c}$ contains
only one nonzero component, say $c_{i_0} = \pm 1$, then the vector
$\mathbf{w} = \mathbf{A}\mathbf{c}$ gives extraction (say $y(k)$) of the source
with index $i_0$, since

$$ y(k) := \mathbf{w}^T\mathbf{x}(k) = \mathbf{c}^T\mathbf{A}^T\mathbf{x}(k) = \mathbf{c}^T\mathbf{s}(k) = \pm s_{i_0}(k) \qquad \forall k = 1, 2, \dots \tag{4} $$

By the following lemma we show that the solutions $\mathbf{c}$ of DP(p) have
exactly one nonzero element. So, we can obtain the vectors
$\mathbf{w} = \mathbf{A}\mathbf{c}$ as solutions of the original problem OP(p),
and by (4) we achieve extraction of one source.

One interesting property of the optimization problem OP(p) is that it has
exactly n solutions (up to sign) which are orthonormal, and any of them gives
extraction of one source signal. The fixed point algorithm [13] finds its
solutions one by one.

We note that the idea of maximizing $\operatorname{cum}_4(\mathbf{w}^T\mathbf{x})$
in order to extract one source from a linear mixture was already considered in
[8].

The next lemma gives a mathematical explanation of the maximization of
kurtosis method (applied by many authors) and its generalization, proposed here:
maximization of the absolute value of even order cumulants.

Lemma 1 Consider the optimization problem:

$$ \text{minimize (maximize)} \quad f(\mathbf{v}) = \sum_{i=1}^{n} k_i v_i^p \qquad \text{subject to } \|\mathbf{v}\| = 1, $$

where $p > 2$ is even and $\mathbf{v} = [v_1, \dots, v_n]^T$. Denote

$$ I^+ = \{ i \in \{1, \dots, n\} : k_i > 0 \}, $$

$$ I^- = \{ i \in \{1, \dots, n\} : k_i < 0 \}, $$

and $\mathbf{e}_i = (0, \dots, 0, 1, 0, \dots, 0)$ (1 is in the i-th place).

Then the points of local minimum are exactly the vectors $\pm\mathbf{e}_i$,
$i \in I^-$, and the points of local maximum are exactly the vectors
$\pm\mathbf{e}_j$, $j \in I^+$.

Proof. Applying the Lagrange multipliers theorem for a point of a local
optimum $\mathbf{v} = (v_1, \dots, v_n)$, we write:

$$ k_i p v_i^{p-1} - 2\lambda v_i = 0, \qquad i = 1, \dots, n, \tag{5} $$

where $\lambda$ is a Lagrange multiplier.

Multiplying (5) by $v_i$ and summing, we obtain:

$$ p f_{opt.} = 2\lambda, \tag{6} $$

where $f_{opt.}$ means the value of $f$ at the local optimum. From (5) and (6)
we obtain

$$ v_i \left( k_i p v_i^{p-2} - p f_{opt.} \right) = 0. \tag{7} $$

Hence $v_i = 0$ if $k_i$ and $f_{opt.}$ have different signs, and in general

$$ \text{either } v_i = 0 \quad \text{or} \quad k_i v_i^{p-2} = f_{opt.}, \tag{8} $$

which is the first order optimality condition for the initial problem.

Case 1. Assume that $k_{i_0} < 0$ for some index $i_0$ and $\mathbf{v}$ is a
local minimum. Then obviously $f_{loc.min.} < 0$. According to the second order
sufficient optimality condition [1], a point $\mathbf{x}^0$ is a local minimum if

$$ \mathbf{h}^T L''(\mathbf{x}^0)\,\mathbf{h} > 0 \qquad \forall \mathbf{h} \in K(\mathbf{x}^0),\ \mathbf{h} \neq 0, \tag{9} $$

where

$$ K(\mathbf{x}^0) = \{ \mathbf{h} : \mathbf{h}^T\mathbf{x}^0 = 0 \} $$

is the tangent space to the constraint set at $\mathbf{x}^0$ and

$$ L(\mathbf{x}) = \sum_{i=1}^{n} k_i x_i^p - \lambda \left( \|\mathbf{x}\|^2 - 1 \right) $$

is the Lagrange function. The second order necessary condition [1] states that
if $\mathbf{x}^0$ is a local minimum, then

$$ \mathbf{h}^T L''(\mathbf{x}^0)\,\mathbf{h} \ge 0 \qquad \forall \mathbf{h} \in K(\mathbf{x}^0). \tag{10} $$

In our case, by (6) and (8) we obtain

$$ \mathbf{h}^T L''(\mathbf{v})\,\mathbf{h} = \sum_{i=1}^{n} \left( p(p-1) k_i v_i^{p-2} - 2\lambda \right) h_i^2 = p\, f_{loc.min.} \left[ (p-2) \sum_{i \in I} h_i^2 - \sum_{i \notin I} h_i^2 \right], \tag{11} $$

where $I$ is the set of those indexes $i$ for which $v_i$ is different from 0.

We will check the second order sufficient condition (9) for a local minimum
for the points $\pm\mathbf{e}_{i_0}$ (the first order optimality condition (8)
is obviously satisfied for $\pm\mathbf{e}_{i_0}$). We have

$$ K(\pm\mathbf{e}_{i_0}) = \{ \mathbf{h} : h_{i_0} = 0 \}, $$

therefore, for $\mathbf{h} \in K(\pm\mathbf{e}_{i_0})$, $\mathbf{h} \neq 0$, we
have

$$ \mathbf{h}^T L''(\pm\mathbf{e}_{i_0})\,\mathbf{h} > 0, $$

since $h_{i_0} = 0$ and $f_{loc.min.} < 0$, i.e. the second order sufficient
condition (9) is satisfied, and therefore $\pm\mathbf{e}_{i_0}$ is a local
minimum.

By (11) it follows that for any vector $\mathbf{v}'$ with at least two nonzero
elements, say $v'_{i_0}$ and $v'_{j_0}$, the quadratic form
$\mathbf{h}^T L''(\mathbf{v}')\,\mathbf{h}$ can take negative values for some
vectors $\mathbf{h} \in K(\mathbf{v}')$. For instance, taking the vector
$\mathbf{h}'$ with $i_0$-th component equal to $v'_{j_0}$, $j_0$-th component
equal to $-v'_{i_0}$ and the rest equal to zero, we see that
$\mathbf{h}' \in K(\mathbf{v}')$, but
$\mathbf{h}'^T L''(\mathbf{v}')\,\mathbf{h}' < 0$. Therefore, the second order
necessary condition (10) for a local minimum is not satisfied for any vector
with more than one nonzero element.

Case 2. Assume that $k_j > 0$ for some index $j$ and $\mathbf{v}$ is a local
maximum. We apply Case 1 to the function $-f$ and finish the proof. ■



3. A generalization of the fixed point algorithm

Consider the following algorithm:

$$ \mathbf{w}(l) = \frac{\varphi_p'(\mathbf{w}(l-1))}{\|\varphi_p'(\mathbf{w}(l-1))\|}, \qquad l = 1, 2, \dots, \tag{12} $$

which is a generalization of the fixed point algorithm of Hyvarinen and Oja.
The name is derived from the Lagrange equation for the optimization problem
OP(p), since (12) tries to find a solution of it iteratively, and this solution
is a fixed point of the operator defined by the right-hand side of (12).

The next theorem gives precise conditions for convergence of the fixed point
algorithm of Hyvarinen and Oja and its generalization (12) (for a proof, see
[9]).

Theorem 1 Assume that $s_i$ are statistically independent, zero mean signals and
the mixing matrix $\mathbf{A}$ is orthogonal. Let $p \ge 4$ be a given even
integer number, $\operatorname{cum}_p(s_i) \neq 0$, $i = 1, \dots, n$, and let

$$ I(\mathbf{c}) = \arg\max_{1 \le i \le n} |c_i| \, \left| \operatorname{cum}_p(s_i) \right|^{\frac{1}{p-2}}. $$

Denote by $W_0$ the set of all elements $\mathbf{w} \in \mathbb{R}^n$ such that
$\|\mathbf{w}\| = 1$, the set $I(\mathbf{A}^T\mathbf{w})$ contains only one
element, say $i(\mathbf{w})$, and $c_{i(\mathbf{w})} \neq 0$, where
$\mathbf{c} = \mathbf{A}^T\mathbf{w}$.

Then

(a) The complement of $W_0$ has measure zero.

(b) If $\mathbf{w}(0) \in W_0$ then

$$ \lim_{l \to \infty} y_l(k) = \pm s_{i_0}(k) \qquad \forall k = 1, 2, \dots, $$

where $y_l(k) = \mathbf{w}(l)^T\mathbf{x}(k)$ and $i_0 = i(\mathbf{w}(0))$.

(c) The rate of convergence in (b) is of order $p - 1$.



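4. Examples and remarks for practical implementation

Below we consider some examples, for concrete values of p.

1) p = 4. Then

$$ \varphi_4(\mathbf{w}) = \operatorname{cum}_4(\mathbf{w}^T\mathbf{x}) = E\{(\mathbf{w}^T\mathbf{x})^4\} - 3\left(E\{(\mathbf{w}^T\mathbf{x})^2\}\right)^2 $$

and

$$ \varphi_4'(\mathbf{w}) = 4E\{(\mathbf{w}^T\mathbf{x})^3\mathbf{x}\} - 12E\{(\mathbf{w}^T\mathbf{x})^2\}\,E\{\mathbf{x}\mathbf{x}^T\}\,\mathbf{w}. $$

We note that if the standard prewhitening is applied (i.e.
$E\{\mathbf{x}\mathbf{x}^T\} = \mathbf{I}_n$, $\mathbf{A}$ is orthogonal), the
algorithm (12) recovers the fixed-point algorithm of Hyvarinen and Oja, i.e.

$$ \mathbf{w}(l+1) = \frac{E\{(\mathbf{w}(l)^T\mathbf{x})^3\mathbf{x}\} - 3\mathbf{w}(l)}{\left\| E\{(\mathbf{w}(l)^T\mathbf{x})^3\mathbf{x}\} - 3\mathbf{w}(l) \right\|}. $$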
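
The p = 4 iteration above can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: expectations are replaced by sample means, the two uniform sources and the rotation used as an orthogonal mixing matrix are hypothetical, and the data are treated as already whitened because the sources have unit variance.

```python
import numpy as np

def fixed_point_kurtosis(z, w0, n_iter=20):
    """Iterate w <- (E{(w^T z)^3 z} - 3 w) / ||.|| on whitened data z of shape (n, n_samples)."""
    w = np.asarray(w0, dtype=float)
    w = w / np.linalg.norm(w)
    for _ in range(n_iter):
        y = w @ z                                    # current projection w^T z
        w_new = (z * y ** 3).mean(axis=1) - 3.0 * w  # sample version of the update
        w = w_new / np.linalg.norm(w_new)
    return w

rng = np.random.default_rng(0)
# Two independent, zero-mean, unit-variance sources (uniform, negative kurtosis).
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 20000))
theta = 0.5
a = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])      # orthogonal mixing matrix
x = a @ s

w = fixed_point_kurtosis(x, w0=[1.0, 0.0])
c = a.T @ w      # by the lemma, c should have a single dominant entry close to +-1
```

With this starting vector the first entry of c = A^T w(0) is the largest in magnitude, so the iteration settles on the first source up to sign, in line with Theorem 1. The sign of w may alternate between iterations because the kurtosis of a uniform source is negative; only the direction matters for extraction.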

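2) p = 6. Then

$$ \varphi_6(\mathbf{w}) = E\{(\mathbf{w}^T\mathbf{x})^6\} - 15E\{(\mathbf{w}^T\mathbf{x})^2\}E\{(\mathbf{w}^T\mathbf{x})^4\} - 10\left(E\{(\mathbf{w}^T\mathbf{x})^3\}\right)^2 + 30\left(E\{(\mathbf{w}^T\mathbf{x})^2\}\right)^3 $$

and

$$ \begin{aligned} \varphi_6'(\mathbf{w}) ={}& 6E\{(\mathbf{w}^T\mathbf{x})^5\mathbf{x}\} - 30E\{\mathbf{x}\mathbf{x}^T\}\mathbf{w}\,E\{(\mathbf{w}^T\mathbf{x})^4\} \\ & - 60E\{(\mathbf{w}^T\mathbf{x})^2\}E\{(\mathbf{w}^T\mathbf{x})^3\mathbf{x}\} \\ & - 60E\{(\mathbf{w}^T\mathbf{x})^3\}E\{(\mathbf{w}^T\mathbf{x})^2\mathbf{x}\} \\ & + 180\left(E\{(\mathbf{w}^T\mathbf{x})^2\}\right)^2 E\{\mathbf{x}\mathbf{x}^T\}\mathbf{w}. \end{aligned} $$

If, in addition, $E\{\mathbf{x}\mathbf{x}^T\} = \mathbf{I}$, then

$$ \begin{aligned} \varphi_6'(\mathbf{w}) ={}& 6E\{(\mathbf{w}^T\mathbf{x})^5\mathbf{x}\} - 30\mathbf{w}\,E\{(\mathbf{w}^T\mathbf{x})^4\} \\ & - 60E\{(\mathbf{w}^T\mathbf{x})^3\mathbf{x}\} \\ & - 60E\{(\mathbf{w}^T\mathbf{x})^3\}E\{(\mathbf{w}^T\mathbf{x})^2\mathbf{x}\} \\ & + 180\mathbf{w}. \end{aligned} $$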

We have implemented the algorithm (12) in the case of sixth order cumulants,
using a deflation procedure. We generated 50 sparse source signals and ran
the algorithm after random mixing. The number of iterations for extracting
all signals one by one using sixth order cumulants was 313, and the number of
iterations using fourth order cumulants was 339. However, the computational time
is larger in the case of sixth order cumulants. This can be explained by the
computational time needed for calculation of the sixth order cumulants, since
they have a more complex structure.

We should also mention that the successful extraction of components depends
on their statistical independence. When they are not sufficiently independent,
the extraction is problematic and depends on the initial condition.

5. Combining second and fourth order statistics 

In this section we consider a unified model of source signals and additive
noise, which is white of order 2 and 4. We assume that all source signals are
uncorrelated of order 2 and 4, where some of the source signals (we do not know
which) are white of order 4 but colored of order 2 (for instance colored Gaussian
signals) and the rest are white of order 2 and colored of order 4. We introduce
a new sufficient condition for separation (see condition DCF(P) below) stating
that the sources have different autocorrelation functions or different cumulant
functions of fourth order over a given set P of time delays. This condition can
be considered as a generalization of those described in [5] and [17] (for
second order statistics) and used in [4].

Second and fourth order statistics are used in [11] and [14] by a joint
diagonalization procedure of covariance and cumulant matrices, combining the
SOBI and JADE algorithms described in [2] and [3] respectively. Our idea
below is to use cumulant matrices involving time delays.

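Define a covariance matrix of the sensor (resp. source) signals by

$$R_x(p) = E\{\mathbf{x}\mathbf{x}_p^T\}, \quad (\text{resp. } R_s(p) = E\{\mathbf{s}\mathbf{s}_p^T\}), \tag{13}$$

where $E$ is the mathematical expectation, $\mathbf{x}_p = \mathbf{x}(k - p)$, $\mathbf{x} = \mathbf{x}(k)$, $\mathbf{s}_p = \mathbf{s}(k - p)$, $\mathbf{s} = \mathbf{s}(k)$, and the symmetric matrix $\tilde{R}_x(p)$ by

$$\tilde{R}_x(p) = \frac{1}{2}\left(R_x(p) + R_x(p)^T\right).$$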
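The two definitions above can be sketched in a few lines of plain Python, with the expectation replaced by a sample mean over the available time indices. The function names `delayed_covariance` and `symmetrize` and the toy two-channel signal are hypothetical and serve only to show that the raw estimate need not be symmetric while the symmetrized one is.

```python
def delayed_covariance(signals, p):
    """Sample estimate of R_x(p) = E{x(k) x(k-p)^T}; signals[i][k] is channel i at time k."""
    n = len(signals)
    length = len(signals[0])
    count = length - p
    return [[sum(signals[i][k] * signals[j][k - p] for k in range(p, length)) / count
             for j in range(n)]
            for i in range(n)]


def symmetrize(matrix):
    """Return (M + M^T) / 2."""
    n = len(matrix)
    return [[0.5 * (matrix[i][j] + matrix[j][i]) for j in range(n)] for i in range(n)]


signals = [
    [1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1],
]
R = delayed_covariance(signals, 1)
print(R)               # off-diagonal entries differ in sign
print(symmetrize(R))   # symmetric by construction
```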



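Define a fourth order cumulant matrix $C^{2,2}_{x,x_p}$ of the sensor signals by

$$C^{2,2}_{x,x_p} = E\{\mathbf{x}\mathbf{x}^T(\mathbf{x}_p^T\mathbf{x}_p)\} - E\{\mathbf{x}\mathbf{x}^T\}\,\mathrm{tr}\,E\{\mathbf{x}_p\mathbf{x}_p^T\} - 2E\{\mathbf{x}\mathbf{x}_p^T\}E\{\mathbf{x}_p\mathbf{x}^T\},$$

and the symmetric matrix $\tilde{C}^{2,2}_{x,x_p}$ by

$$\tilde{C}^{2,2}_{x,x_p} = \frac{1}{2}\left(C^{2,2}_{x,x_p} + \left(C^{2,2}_{x,x_p}\right)^T\right).$$

Note that $C^{2,2}_{x,x_p}$ (resp. $R_x(p)$) is symmetric if $C^{2,2}_{s,s_p}$ (resp. $R_s(p)$) is a diagonal
matrix, but in order to avoid the effect of computational errors (which could
destroy the symmetry), we use $\tilde{C}^{2,2}_{x,x_p}$ (resp. $\tilde{R}_x(p)$).

It is easy to see that the $(i, j)$-th element of $C^{2,2}_{x,x_p}$ is

$$C^{2,2}_{x,x_p}(i, j) = \sum_{l=1}^{n} \mathrm{cum}\{x_i(k), x_j(k), x_l(k - p), x_l(k - p)\}$$

(see [3] for more general cumulant matrices).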
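The agreement between the matrix definition and the element-wise cumulant sum can be verified on sample data. The following plain-Python sketch treats the data as zero-mean and replaces expectations by sample means; `cumulant_matrix`, `cumulant_element`, the Gaussian test series and the delay p = 2 are illustrative assumptions, not taken from the original text.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def cumulant_matrix(x, xp):
    """E{x x^T (xp^T xp)} - E{x x^T} tr E{xp xp^T} - 2 E{x xp^T} E{xp x^T}.

    x and xp are equally long lists of sample vectors x(k) and x(k-p).
    """
    idx = range(len(x[0]))
    sq = [sum(v * v for v in b) for b in xp]                     # xp^T xp per sample
    first = [[mean(a[i] * a[j] * s for a, s in zip(x, sq)) for j in idx] for i in idx]
    cov = [[mean(a[i] * a[j] for a in x) for j in idx] for i in idx]
    trace = mean(sq)                                             # tr E{xp xp^T}
    cross = [[mean(a[i] * b[l] for a, b in zip(x, xp)) for l in idx] for i in idx]
    return [[first[i][j] - cov[i][j] * trace
             - 2 * sum(cross[i][l] * cross[j][l] for l in idx)
             for j in idx]
            for i in idx]


def cumulant_element(x, xp, i, j):
    """sum_l cum{x_i(k), x_j(k), x_l(k-p), x_l(k-p)} for zero-mean data."""
    total = 0.0
    for l in range(len(x[0])):
        m4 = mean(a[i] * a[j] * b[l] * b[l] for a, b in zip(x, xp))
        m_ij = mean(a[i] * a[j] for a in x)
        m_ll = mean(b[l] * b[l] for b in xp)
        m_il = mean(a[i] * b[l] for a, b in zip(x, xp))
        m_jl = mean(a[j] * b[l] for a, b in zip(x, xp))
        total += m4 - m_ij * m_ll - 2 * m_il * m_jl
    return total


rng = random.Random(2)
p = 2
series = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(300)]
x, xp = series[p:], series[:-p]                                  # x(k) and x(k - p)
C = cumulant_matrix(x, xp)
worst = max(abs(C[i][j] - cumulant_element(x, xp, i, j))
            for i in range(3) for j in range(3))
print(worst)   # rounding error only
```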

Similarly we define analogous matrices $C^{2,2}_{s,s_p}$ and $\tilde{C}^{2,2}_{s,s_p}$ for the source signals
$\mathbf{s}(k)$. Recall that a signal $s$ is white of order 2 (resp. white of order 4) if

$$E\{s(k)s(k - p)\} = 0, \quad \forall p \ge 1$$

(resp. $\mathrm{cum}\{s(k - p_1), s(k - p_2), s(k - p_3), s(k - p_4)\} = 0$
for every $p_i \ge 1$, $i = 1, \dots, 4$) (see [15]).

In a linear data model (1), if the noise $\mathbf{n}$ is white of order 4, and the mixing
matrix $A$ is orthogonal, then the time-delayed cumulant matrices of the
observation vector $\mathbf{x}(k)$ for any $p \ne 0$ satisfy

$$C^{2,2}_{x,x_p} = A\,C^{2,2}_{s,s_p}A^T. \tag{14}$$

If $\mathbf{n}$ is white of order 2, then

$$R_x(p) = A R_s(p) A^T. \tag{15}$$

The representations (14) and (15) suggest how separation can be achieved by
combining fourth and second order statistics: we jointly diagonalize matrices
of type $\tilde{C}^{2,2}_{x,x_i}$ for $i = 1, \dots, q$ and matrices of type $\tilde{R}_x(j)$
for $j = 1, \dots, p$. This is our algorithm called SFOBLpq, which we use in the
sequel.

For this procedure to be successful, the following condition, called
DCF(P) (different cumulant functions), must be satisfied:

$$\forall i, j \ne i \;\; \exists\, l_{ij} \in \{1, \dots, L\}: \quad \text{either } E\{s_i(t)s_i(t - p_{l_{ij}})\} \ne E\{s_j(t)s_j(t - p_{l_{ij}})\} \quad \text{or } \mathrm{cum}_{s_i}(p_{l_{ij}}) \ne \mathrm{cum}_{s_j}(p_{l_{ij}}),$$

i.e. the sources have different autocorrelation or cumulant functions of fourth
order on a given set $P = \{p_1, \dots, p_L\}$ of time delays. For a detailed description
of this condition and its necessity, see [10].

6. Experiment with EEG data for detection of eye
movements (Experimental comparison of methods
employing second order statistics, fourth order
statistics and their combination)

Our experimental setup consisted of a NeuroScan electroencephalographic
(EEG) system (Neurosoft Inc.) equipped with a 64-channel QuickCap using
Ag-AgCl-type electrodes. The electrical currents from the scalp were amplified
by two SynAmps amplifiers and digitally recorded using SCAN acquisition
software at a sampling rate of 1000 Hz. The biosignals were bandpass-filtered
and electrical noise was removed outside the range 0.05-200 Hz. Electrode
impedance was lower than 5 kOhm for all channels except for channel C4. The
subjects were seated in a dark shielded room and the images were projected
by two Marquee-type CRT projectors on the left side of a screen located 2.8 m
frontally.

Figure 3.1. Eye movement pattern projected on a screen in front of the subjects.



At the beginning of each eye movement trial, a pattern image appeared on the
screen, consisting of 4 white horizontal thick lines on a black background (see
Fig. 3.1). The end of each horizontal thick line was connected to the beginning
of the next horizontal line by a thin diagonal line. After the pattern appeared
on the screen, the subject had to perform the following task: fixate a small
circular gaze point on the left side of the first line for 0.5 seconds, then move
the gaze smoothly along each horizontal line and cross diagonally to the next
one. At the end of the fourth horizontal eye movement, the subject had to fixate
the gaze again for 0.5 seconds on the small circle on the right side of the line,
then press a button to indicate that the task was finished. Pressing the button
caused the pattern to disappear, and after a 7-second break a short warning
sound called for attention. An additional 3 seconds were given for concentration
before the next trial pattern appeared and the procedure was repeated. After
artifact rejection and segmentation procedures (8192 ms epochs), 19 channels
corresponding to the international 10/20 system were extracted from the original
data set for better visualization. The segmented eye movement data was processed
using independent component analysis (ICA). We employed three methods in order
to separate the ocular electrical potentials from the evoked cortical activity:
a second order method (algorithm EVD2), a fourth order method (algorithm JADETD)
and a combined 2nd and 4th order statistics method (algorithm SFOBLpq) described
in Section 5. These algorithms are implemented within the framework of the
ICALAB for Signal Processing software package [7].

7. Results 

Our experiments showed that eye movement information could be extracted 
from the electroencephalogram without using dedicated horizontal and vertical 
oculographic electrodes. 

Single-trial eye-movement EEG data (Fig. 3.2) was processed using three 
component extraction algorithms. We compared the performance of the com- 
bined 2nd and 4th order SFOBLpq algorithm with the 2nd-order EVD2 and the 
4th order JADETD algorithms by setting two goals: 1) extraction of eye move- 
ment data in single-trial EEG recordings, and 2) separation of the horizontal 
and vertical components within the 4-line eye movements. 

The extracted components showed that the 4th-order JADETD algorithm
(Fig. 3.3c) is not able to extract any of the relatively slow eye movement
components, while the combined 2nd- and 4th-order SFOBLpq (Fig. 3.3a) and
the 2nd-order EVD2 (Fig. 3.3b) both completed the task of extraction. In
SFOBLpq, the number of time-delayed covariance matrices p had to be chosen
to be greater than 1 and smaller than the number of time-delayed cumulant
matrices q.

Figure 3.2. Single-trial eye-movement EEG data recorded from electrode F8 which is located
near the right eye. This trial exhibits two deviations from the intended movement-rest routine -
between the 3rd and 4th line scans and at the end of the trial during the resting period.

Figure 3.3. Independent components showing the performance of the three compared extraction
algorithms: (a) combined 2nd and 4th order SFOBLpq (p=10, q=100), (b) 2nd-order EVD2, (c)
4th order JADETD.

Below we examine SFOBLpq and EVD2 in order to compare their performance on
the second goal, separating the horizontal and vertical eye movement
components. We selected two components generated by each of the SFOBLpq and
EVD2 algorithms using the following criteria: 1) the first component must
have an increased signal-to-noise ratio during horizontal eye movements, while
the second component must complement the first one for vertical and diagonal
movements; 2) if there are two similar components available, then the
following selection criteria were applied by projecting components back
into signal space (signal reconstruction): 2A) if several similar components
also result in similar signal changes after reconstruction, then the component
with the maximum impact (projected peak amplitude) was chosen; 2B) adding
or removing a second component should not significantly change or distort the
reconstructed signal from the first one.

Fig. 3.4 shows reconstructed EEG signals after selecting and deflating both
components for each of SFOBLpq and EVD2. We found that after reconstructing
the horizontal eye movement component the signal in the right lateral channel
F8 (according to the International 10/20 EEG system) was strongest, while
reconstructing the vertical/diagonal component most strongly influenced the
right anterior channel FP2. Although both algorithms matched the combined eye
movement, EVD2 exhibited spurious vertical eye movement activity in channel
FP2 at the beginning and near the end of the trial.

Figure 3.4. Reconstruction of eye movement signals from 2 components which represent the
electrical activity originating from the combined horizontal and vertical motion of the eyes,
(a) signal reconstruction for all channels using combined 2nd and 4th order SFOBLpq (p=10,
q=100), (b) signal reconstruction using 2nd-order EVD2. Time-frequency analysis spectrograms
in (c) and (d) show the frequency domain changes in time for channels F8 and FP2, which
demonstrated strongest sensitivity for horizontal and diagonal eye movements respectively.
EVD2 exhibited spurious vertical eye movement activity in channel FP2 at the beginning and
near the end of the trial.

Fig. 3.5 and Fig. 3.6 illustrate the separate reconstruction of the horizontal
and vertical eye movement components for SFOBLpq and EVD2. SFOBLpq
successfully removed a short vertical saccade at the end of the trial from the
reconstructed horizontal movement data, while EVD2 failed (Fig. 3.5).
Furthermore, in the vertical component projection (Fig. 3.6), SFOBLpq showed an
advantage in extracting all 3 weak vertical eye movement potentials, while
EVD2 entirely failed to detect the 3rd diagonal movement and exhibited false
positives in the frequency domain at the beginning and before the end of the
trial.
Figure 3.5. Reconstruction of left-to-right horizontal eye movement signal in right lateral 
electrode F8 from a single component, (a) reconstruction using combined 2nd and 4th order 
SFOBLpq (p=10, q=100), (b) reconstruction using 2nd-order EVD2. Although for horizontal 
eye movements the performance of both algorithms was relatively similar, EVD2 incorporated 
also the sharp vertical movement at the end of the trial and failed to separate it from the ‘hori- 
zontal’ component. 





Figure 3.6. Reconstruction of diagonal eye movement signal in right anterior electrode FP2 
from a single component, (a) reconstruction using combined 2nd and 4th order SFOBLpq 
(p=10, q=100), (b) reconstruction using 2nd-order EVD2. SFOBLpq showed an advantage in 
extracting all 3 weak vertical eye movement potentials, while EVD2 failed to detect the 3rd 
diagonal movement and exhibited false positives in the frequency domain at the beginning and 
just before the end of the trial. 
8. Conclusion

We presented a mathematical justification of the cumulant maximization
method for ICA and gave generalizations of: 1) the fixed-point algorithm of
Hyvärinen and Oja to higher-order cumulants, and 2) the cumulant method,
combining second and fourth order statistics by joint approximate diagonalization
of covariance and cumulant matrices depending on time delays. In our eye
movement EEG experiments, results indicated that the proposed combined 2nd- and
4th-order algorithm exhibited better performance than the 2nd-order and the
4th-order algorithms in properly extracting and separating the strong horizontal
movement signals from the weak diagonal movement potentials.
Abstract Significant progress has been made recently in epileptic seizure prediction.
Further advances may be confronted by the absence of quantitative measures of
structures in complex systems and challenged because the laws of the brain are
probably still far away from our knowledge. In this context the paper suggests
seeking the answers within an irreducible description of complex systems. A
special type of processes, the formations of integer relations, is proposed to describe
and quantize complex systems in an irreducible way. As an approximation this
results in a new complexity measure expressed as a trace of a variance-covariance
matrix. A connection between the measure and measures being developed for
structures in complex systems is presented.

Keywords: brain disorders, complex systems, measures of complexity, formation processes
of integer relations, traces of variance-covariance matrix.

1. Introduction 

Methods of complex systems open new ways of dealing with brain disorders
[1]. In particular, significant progress has recently been made in epileptic
seizure prediction by using short-term maximum Lyapunov exponents [2, 3].
At the same time further advances in this direction may be confronted by the
absence of quantitative measures of structures in dynamics of complex systems
[4]. Moreover, the situation is challenged because the laws of the brain may
still be far away from our knowledge and experience.

In this context the paper suggests seeking the answers within an irreducible
description of complex systems. A special type of processes, i.e., the formations
of integer relations [5], is used to describe and quantize complex systems. A
key feature of these processes is that they are irreducible. This property
holds because the formations of integer relations are completely controlled by
arithmetic and have the integers as the ultimate building blocks.

The formation processes are unusual in the sense that they have not been 
observed in physical experiments. Information about a formation process of 
integer relations can be obtained from a system of linear equations describing a 
nonlocal correlation between sequences [5, 6]. These known results are briefly 
given in sections 2 and 3 to introduce the formation processes. 

A geometrical interpretation of the formations of integer relations is presented
in section 4. The interpretation even allows one to imagine the integer relations
as some sort of particles, called integer particles [7].

In section 5 it is shown how to characterize and quantify structures of a
complex system in terms of the formations of integer relations. As an
approximation this results in a new complexity measure expressed in terms of a
trace of a variance-covariance matrix. The interval of the trace is used to measure
the complexity of the structures and in this way the complexity of the states.
Computational experiments suggest a possible quantization of the interval into
separate regions with similar structures.

A connection between the measure and measures being developed for struc- 
tures in complex systems [8-10] is presented. This may suggest a new per- 
spective to describe and quantize complex systems in terms of the formations 
of integer relations. 

The main results of the paper are summarized in the conclusions.

2. A Nonlocal Correlation between Sequences 

In this section a notion of nonlocal correlation between two different se- 
quences is presented [6]. The correlation produces a hierarchical structure 
consisting of parts of the sequences. It is possible to express the nonlocal 
correlation as a system of linear equations in integers. The system of equa- 
tions gives information about a special type of processes, i.e., the formations of 
integer relations [5]. 

Let $I$ be an integer alphabet and

$$I_n = \{s = s_1 \dots s_n,\; s_i \in I,\; i = 1, \dots, n\}$$

be the set of all sequences of length $n \ge 2$ with symbols in $I$. Let $\delta > 0$ and
$\varepsilon > 0$ be respective spacings of a space-time lattice $(\delta, \varepsilon)$ in 1 + 1 dimensions.

Figure 4.1. Graph of a function $f = \rho_{m\varepsilon\delta}(s)$, $s = +1 -1 -1 +1 +1 +1 -1 -1$.

Let $W_{\varepsilon\delta}([t_m, t_{m+n}])$ be a class of piecewise constant functions such that a
function $f$ of the class is constant on $(t_{i-1}, t_i]$, $i = m+1, \dots, m+n$ and equals

$$f(t) = s_{i-m}\delta, \quad t \in (t_{i-1}, t_i],$$

$$i = m+1, \dots, m+n, \quad t_i = i\varepsilon, \quad i = m, \dots, m+n,$$

where $m$ is an integer and $s_i$, $i = 1, \dots, n$ are real numbers. Sequence $s =
s_1 \dots s_n$ is called a code of the function $f$, denoted $s = c(f)$. Figure 4.1 shows
a function $f$ such that

$$f \in W_{11}([t_0, t_8]), \quad c(f) = +1 -1 -1 +1 +1 +1 -1 -1.$$

Let $\rho_{m\varepsilon\delta} : s \to f$ be a mapping that associates a sequence $s \in I_n$ with a
function $f \in W_{\varepsilon\delta}([t_m, t_{m+n}])$, denoted $f = \rho_{m\varepsilon\delta}(s)$, such that $c(f) = s$ and
whose $k$th integral satisfies $f^{[k]}(t_m) = 0$, $k = 1, 2, \dots$.

We characterize a sequence $s = s_1 \dots s_n \in I_n$ by successive integrals
$f^{[k]}$, $k = 1, 2, \dots$, of a function $f = \rho_{m\varepsilon\delta}(s)$. Figures 4.2 and 4.3 illustrate the
characterization of a sequence

$$s = +1 -1 -1 +1 +1 +1 -1 -1$$

by showing $f^{[1]}(t)$, $f^{[2]}(t)$, where $n = 8$, $m = 0$, $\varepsilon = 1$, $\delta = 1$.

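We consider a nonlocal correlation between two different sequences

$$s = s_1 \dots s_n \in I_n, \quad s' = s'_1 \dots s'_n \in I_n$$

that results in the following correlation between $C(s, s') \ge 1$ integrals

$$f^{[k]}(t_{m+n}) = g^{[k]}(t_{m+n}), \quad k = 1, \dots, C(s, s') \tag{1}$$

of functions $f = \rho_{m\varepsilon\delta}(s)$, $g = \rho_{m\varepsilon\delta}(s')$. The correlation between the
sequences $s, s'$ does not extend to the $(C(s, s') + 1)$th integrals of the functions

$$f^{[C(s,s')+1]}(t_{m+n}) \ne g^{[C(s,s')+1]}(t_{m+n}). \tag{2}$$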
Figure 4.2. Graph of the first integral of function $f$ in Figure 4.1.

Figure 4.3. Graph of the second integral of function $f$ in Figure 4.1.

Figure 4.4. First integrals $f^{[1]}$ and $g^{[1]}$ show the correlation. Starting at $t_0$ the integrals move
differently in the interval but come together at $t_8$. As the move from one step to another is
restricted, the integrals are correlated to meet at the right end.

Figure 4.5. Second integrals $f^{[2]}$ and $g^{[2]}$ show the correlation. Starting at $t_0$ the integrals
move differently in the interval but the correlation makes them come together at $t_8$.

Figures 4.4 and 4.5 present the situation for sequences

$$s = +1 -1 +1 -1 -1 +1 +1 +1, \quad s' = -1 -1 +1 +1 +1 +1 +1 -1.$$

The correlation between the sequences $s, s'$ results in the correlation between
their first and second integrals

$$f^{[1]}(t_8) = g^{[1]}(t_8), \quad f^{[2]}(t_8) = g^{[2]}(t_8)$$

of functions

$$f = \rho_{011}(s), \quad g = \rho_{011}(s').$$

An integer code series [11] expresses the nonlocal correlation in terms of a
system of linear equations in integers and makes it possible to find formation
processes of integer relations [5].
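The coincidence of the first and second integrals for the two sequences of Figures 4.4 and 4.5 can be reproduced with exact arithmetic. The sketch below assumes m = 0, ε = δ = 1 and zero initial values of all integrals at $t_0$, and it evaluates the k-th integral at the right end through the closed form for repeated integration of a piecewise constant function; the function name `kth_integral_at_end` is a hypothetical helper introduced for this illustration.

```python
from fractions import Fraction
from math import factorial


def kth_integral_at_end(s, k):
    """k-th integral at t_n of the piecewise constant function with code s.

    Assumes m = 0, eps = delta = 1 and zero initial conditions at t_0, so that
    f^[k](t_n) = (1/k!) * sum_i s_i * ((n - i + 1)^k - (n - i)^k).
    """
    n = len(s)
    total = sum(s_i * ((n - i + 1) ** k - (n - i) ** k)
                for i, s_i in enumerate(s, start=1))
    return Fraction(total, factorial(k))


s = [+1, -1, +1, -1, -1, +1, +1, +1]
s_prime = [-1, -1, +1, +1, +1, +1, +1, -1]
for k in (1, 2, 3):
    # equal values for k = 1, 2 and different values for k = 3
    print(k, kth_integral_at_end(s, k), kth_integral_at_end(s_prime, k))
```

Under these assumptions the two sequences agree in their first and second integrals at $t_8$ and differ in the third one.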

3. The Formations of Integer Relations: Processes with 
Integer Particles 

The integer code series expresses an integral of a piecewise constant function

$$f \in W_{\varepsilon\delta}([t_m, t_{m+n}])$$

in terms of the code $c(f)$, powers of integers and combinatorial coefficients
[11].

Integer Code Series (V. Korotkikh, 1988). Let $f \in W_{\varepsilon\delta}([t_m, t_{m+n}])$ be
a piecewise constant function such that $c(f) = s_1 \dots s_n$. Then the $k$th integral
$f^{[k]}$, $k \ge 1$ of the function $f$ at a point $t_{m+l}$, $l = 1, \dots, n$ can be given by

$$f^{[k]}(t_{m+l}) = \sum_{i=0}^{k-1} \alpha_{kmi}\left((m + l)^i s_1 + \dots + (m + 1)^i s_l\right)\varepsilon^k\delta + \sum_{i=1}^{k} \cdots \tag{3}$$

where $\alpha_{kmi}$, $i = 0, \dots, k-1$, and the coefficients of the second sum, $i = 1, \dots, k$, are combinatorial coefficients.

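The integer code series (3) for sequences

$$s = s_1 \dots s_n \in I_n, \quad s' = s'_1 \dots s'_n \in I_n,$$

such that $C(s, s') \ge 1$ gives

$$f^{[k]}(t_{m+n}) = \sum_{i=0}^{k-1} \alpha_{kmi}\left((m + n)^i s_1 + (m + n - 1)^i s_2 + \dots + (m + 1)^i s_n\right)\varepsilon^k\delta,$$

$$g^{[k]}(t_{m+n}) = \sum_{i=0}^{k-1} \alpha_{kmi}\left((m + n)^i s'_1 + (m + n - 1)^i s'_2 + \dots + (m + 1)^i s'_n\right)\varepsilon^k\delta,$$

where $f = \rho_{m\varepsilon\delta}(s)$, $g = \rho_{m\varepsilon\delta}(s')$ and $k = 1, \dots, C(s, s')$.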

It is proved in [5] by using (3) that if for sequences

$$s = s_1 \dots s_n \in I_n, \quad s' = s'_1 \dots s'_n \in I_n$$

we have $C(s, s') \ge 1$, then $C(s, s') \le n$ and condition (1) reduces to a system
of $C(s, s')$ equations

$$\begin{aligned}
(m + n)^0(s_1 - s'_1) + \dots + (m + 1)^0(s_n - s'_n) &= 0 \\
&\;\;\vdots \\
(m + n)^{C(s,s')-1}(s_1 - s'_1) + \dots + (m + 1)^{C(s,s')-1}(s_n - s'_n) &= 0
\end{aligned} \tag{4}$$

while condition (2) results in an inequality

$$(m + n)^{C(s,s')}(s_1 - s'_1) + \dots + (m + 1)^{C(s,s')}(s_n - s'_n) \ne 0. \tag{5}$$
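System (4) together with inequality (5) gives a direct way to compute $C(s, s')$ for integer sequences: count how many consecutive powers, starting from zero, make the weighted sum of differences vanish. A minimal sketch follows; the function name `structural_complexity` and the convention of returning `None` for identical sequences are choices made for this example only.

```python
def structural_complexity(s, s_prime, m=0):
    """Number of leading equations of system (4) that hold, i.e. C(s, s').

    Returns 0 when even the zeroth power sum is nonzero (no correlation) and
    None when the sequences coincide, since then every power sum vanishes.
    """
    n = len(s)
    diffs = [a - b for a, b in zip(s, s_prime)]
    if not any(diffs):
        return None
    weights = [m + n - i for i in range(n)]      # (m+n), (m+n-1), ..., (m+1)
    level = 0
    # For distinct sequences the Vandermonde argument guarantees termination.
    while sum(w ** level * d for w, d in zip(weights, diffs)) == 0:
        level += 1
    return level


s = [+1, -1, +1, -1, -1, +1, +1, +1]
s_prime = [-1, -1, +1, +1, +1, +1, +1, -1]
print(structural_complexity(s, s_prime))   # prints 2
```

For the pair of sequences of Figures 4.4 and 4.5 the powers 0 and 1 give zero and the power 2 does not, in agreement with the coincidence of exactly the first two integrals.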

Notice that if $C(s, s') = n$ then system (4) appears with the matrix

$$\begin{pmatrix}
(m + n)^0 & (m + n - 1)^0 & \dots & (m + 1)^0 \\
(m + n)^1 & (m + n - 1)^1 & \dots & (m + 1)^1 \\
\vdots & \vdots & & \vdots \\
(m + n)^{n-1} & (m + n - 1)^{n-1} & \dots & (m + 1)^{n-1}
\end{pmatrix}$$

whose determinant is a Vandermonde determinant.

Figure 4.6. Represented in this form system (6) appears as the formation of integer relations
with integers 16, . . . , 1 as ultimate building blocks. In the formation all integer relations have
the same organizing principle.

For example, for the Prouhet-Thue-Morse sequences (starting with +1 and
-1) we have $C(s, s') = 4$ and (4) becomes a system of integer relations ($n =
16$, $m = 0$ and factor 2 is ignored for clarity)

$$\begin{aligned}
+16^0 - 15^0 - 14^0 + 13^0 - 12^0 + 11^0 + 10^0 - 9^0 - 8^0 + 7^0 + 6^0 - 5^0 + 4^0 - 3^0 - 2^0 + 1^0 &= 0 \\
+16^1 - 15^1 - 14^1 + 13^1 - 12^1 + 11^1 + 10^1 - 9^1 - 8^1 + 7^1 + 6^1 - 5^1 + 4^1 - 3^1 - 2^1 + 1^1 &= 0 \\
+16^2 - 15^2 - 14^2 + 13^2 - 12^2 + 11^2 + 10^2 - 9^2 - 8^2 + 7^2 + 6^2 - 5^2 + 4^2 - 3^2 - 2^2 + 1^2 &= 0 \\
+16^3 - 15^3 - 14^3 + 13^3 - 12^3 + 11^3 + 10^3 - 9^3 - 8^3 + 7^3 + 6^3 - 5^3 + 4^3 - 3^3 - 2^3 + 1^3 &= 0
\end{aligned} \tag{6}$$

whereas (5) does not follow the character in (6)

$$+16^4 - 15^4 - 14^4 + 13^4 - 12^4 + 11^4 + 10^4 - 9^4 - 8^4 + 7^4 + 6^4 - 5^4 + 4^4 - 3^4 - 2^4 + 1^4 \ne 0. \tag{7}$$
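Relations (6) and inequality (7) are easy to confirm by direct computation. The sketch below builds the sign pattern of the Prouhet-Thue-Morse sequence of length 16 from the parity of binary digits and evaluates the signed power sums over the integers 16, ..., 1; the helper name `thue_morse_signs` is introduced only for this illustration.

```python
def thue_morse_signs(length):
    """First `length` terms of the Prouhet-Thue-Morse sequence as +1/-1, starting with +1."""
    return [1 if bin(i).count("1") % 2 == 0 else -1 for i in range(length)]


signs = thue_morse_signs(16)
integers = range(16, 0, -1)                      # 16, 15, ..., 1
sums = [sum(sign * d ** power for sign, d in zip(signs, integers))
        for power in range(5)]
print(sums)   # prints [0, 0, 0, 0, 1536]
```

The first four sums vanish, as in (6), while the fourth-power sum is nonzero, as in (7).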
The analysis of system (4) can identify important features [5]:

■ integers can be seen as ultimate building blocks of integer relations of
the system (see Figure 4.7 as illustration),

■ there may be more integer relations than the system can show,

■ there are relationships between the integer relations, which can be
described in terms of an organizing principle.
Figure 4.7. In the figure integers seem like particles that can combine and form composite
particles. There can be positive integer particles (shown in black) and negative integer particles
(shown in white). For example, we can see a positive integer particle +(-3) and a negative
integer particle -(-3).

These features suggest speaking of some type of process connected with the
system (4) and even imagining the integer relations as some sort of particles, called
integer particles [7]. The system of linear equations (4) does not allow one to see the
process as in physical experiments, but makes it possible to observe the process
implicitly by giving information about its objects and their relationships.

In particular, system (4) can be associated with a hierarchical set

$$WR(s, s', n, m, I_n)$$

of $C(s, s') \ge 1$ levels whose elements of level $k = 1, \dots, C(s, s')$ are integer
relations of the form

$$A_1 d_1^{k-1} + \dots + A_l d_l^{k-1} = 0,$$

where $A_i, d_i$, $i = 1, \dots, l$ are integers, $d_i > d_{i+1}$, $i = 1, \dots, l - 1$ and $k - 1$ is
the power of $d_i$, $i = 1, \dots, l$ [5]. It is also interpreted that elements of level
$k = 2, \dots, C(s, s')$ of the set are formed from elements of level $(k - 1)$ by the
organizing principle. The formation organizing principle can be described as
follows. If $r > 1$ integer relations

$$A_{i1} d_{i1}^{k-1} + \dots + A_{il(i)} d_{il(i)}^{k-1} = 0, \quad i = 1, \dots, r \tag{8}$$

of level $k = 1, \dots, C(s, s') - 1$, with relation $i$, $i = 1, \dots, r$ containing $l(i)$
terms, satisfy

$$\sum_{i=1}^{r}\left(A_{i1} d_{i1}^{k} + \dots + A_{il(i)} d_{il(i)}^{k}\right) = 0, \tag{9}$$

then integer relation (9) is formed from integer relations (8). Figure 4.8
illustrates the formations of integer relations of the first level from integers.

Figure 4.8. The figure illustrates the formation of integer relations from integers, which both
are viewed as integer particles. Negative integer particle 6 and positive integer particle 4 under
the "interaction" produce $-(6)^0 + 4^0$, which is a composite integer particle made of them because
$-(6)^0 + 4^0 = 0$.

In Figure 4.6 showing (6) as the formation, we can see that integers are 
generated on the zero level and then form integer relations of the first level. 
These integer relations in turn form integer relations of the second level. The 
formation continues to the fourth level, where the integer relation, because of
(7), cannot alone form an element of the next level.
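The level-by-level formation described for system (6) can be traced numerically. In the sketch below, the signed integers 16, ..., 1 are split into consecutive blocks of size 2, 4, 8 and 16; each block of size $2^k$ is checked to be an integer relation of level k (power k - 1), and the same blocks are then evaluated at the next power to show which of them must combine. The helper names `thue_morse_signs` and `block_power_sums` are hypothetical and used only here.

```python
def thue_morse_signs(length):
    """First `length` terms of the Prouhet-Thue-Morse sequence as +1/-1, starting with +1."""
    return [1 if bin(i).count("1") % 2 == 0 else -1 for i in range(length)]


def block_power_sums(signs, integers, block_size, power):
    """Signed power sums sum(sign * d**power) over consecutive blocks."""
    terms = [sign * d ** power for sign, d in zip(signs, integers)]
    return [sum(terms[start:start + block_size])
            for start in range(0, len(terms), block_size)]


signs = thue_morse_signs(16)
integers = list(range(16, 0, -1))
for level in range(1, 5):
    size = 2 ** level
    formed = block_power_sums(signs, integers, size, level - 1)   # relations of this level
    raised = block_power_sums(signs, integers, size, level)       # same blocks, next power
    print(level, formed, raised)
```

At every level the blocks are valid relations (all entries of `formed` are zero), none of them survives alone at the next power (all entries of `raised` are nonzero), and below the fourth level neighbouring blocks cancel in pairs, which is how the relations of the next level arise. At the fourth level the single remaining block gives the nonzero value of (7).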

Thus, the nonlocal correlation (1) between sequences

$$s = s_1 \dots s_n \in I_n, \quad s' = s'_1 \dots s'_n \in I_n$$

can be described in terms of nonlocal correlations between integers (4) and is
realized as the formation of integer relations, also denoted by $WR(s, s', n, m, I_n)$.
A formation process $WR(s, s', n, m, I_n)$ is a complexity type process. The
level to which the formation process progresses is $C(s, s')$ and is called a
structural complexity of sequence $s$ with respect to sequence $s'$ [5].

The formation processes of integer relations are irreducible because they 
are completely controlled by arithmetic and have the integers as the ultimate 
building blocks [5]. 

The nonlocal correlation considered in terms of the integer code series gives
us a characterization of sequences. The characterization recognizes in the
sequence a hierarchical structure consisting of its parts and shows how it is formed.
In this structure even the most distant parts may be related. The structure
is sensitive to the order of the components $s_i$, $i = 1, \dots, n$ in the sequence
$s = s_1 \dots s_n$ and even a minor variation in the order may lead to significant
changes in the structure.

This property is quite different from how entropy characterizes complex
systems. In particular, the entropy of a system described by a discrete probability
distribution function as a sequence $s = s_1 \dots s_n$ is given by

$$H = -\sum_{i=1}^{n} s_i \log s_i, \tag{10}$$

where $\sum_{i=1}^{n} s_i = 1$, $s_i \ge 0$, $i = 1, \dots, n$ and the system is quantified according
to the value of (10).

A permutation of the components $s_i$, $i = 1, \dots, n$ of the sequence $s$ does
not change the entropy. Thus, entropy cannot sense the structures which the
nonlocal correlation deals with. Entropy may stand in some sense for the
average of the structures of the sequence's permutations.
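The permutation invariance of entropy is immediate to check. The sketch below assumes the natural logarithm in the entropy formula (the base only changes the unit) and uses an arbitrary example distribution; the function name `entropy` is an illustrative helper.

```python
import math


def entropy(distribution):
    """Shannon entropy H = -sum_i s_i log s_i (natural logarithm; zero terms skipped)."""
    return -sum(p * math.log(p) for p in distribution if p > 0)


s = [0.5, 0.25, 0.125, 0.125]
permuted = [0.125, 0.5, 0.125, 0.25]
print(math.isclose(entropy(s), entropy(permuted)))   # prints True
```

Any reordering of the components leaves the value unchanged up to rounding, so the order-dependent structure captured by the nonlocal correlation is invisible to this measure.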

Many quantitative measures of complex systems are based on entropy. For
example, the Kolmogorov-Sinai entropy of a complex system can be expressed
as

$$H_{KS} = \sum_{\lambda_i > 0} \lambda_i,$$

where $\lambda_i$ is a positive Lyapunov exponent. The comment about entropy may
explain why the measures of complex systems are insensitive to structures
arising in dynamics of complex systems [4].

4. Geometrization of the Integer Relations Formations 

The concepts of integer and integer relations are an integral part of our mental
equipment. However, they are too abstract to give us insight into what the
formation is all about. It is difficult to imagine how integer relations can
really be formed from other integer relations, because there is no way to visualize
how this may actually happen.
Figure 4.9. The pattern (shaded) of a function $f$ is a two-dimensional geometrical object and
can be visualized.



But is it possible to see what the integer relations look like and how they form
into each other? To answer the question a new understanding of the integer
relations as geometrical objects that can form into each other is developed by
using a notion of integer pattern [5]. The notion is defined by using a notion
of function pattern (see Figure 4.9). Integer patterns are function patterns with
special properties [5].

It is proved by using (3) that there exists an isomorphism between elements
of a set $WR(s, s', n, m, I_n)$, i.e., integer relations, and elements of a set
$WP_{\varepsilon\delta}(s, s', n, m, I_n)$, i.e., integer patterns of integrals $f^{[k]}, g^{[k]}$, $k = 1, \dots, C(s, s')$,
where

$$f = \rho_{m\varepsilon\delta}(s), \quad g = \rho_{m\varepsilon\delta}(s'),$$

and between their formations as well [5]. The isomorphism gives a geometrical
meaning to the integer relations and their formations in terms of integer
patterns and their formations. In particular, the integer relations and their
formation processes can be quantitatively described by corresponding integer
patterns. The organizing principle appears as function integration.

The formation of integer relations and the corresponding formation of integer
patterns are shown together in Figure 4.10. We can see an integer relation by a
corresponding integer pattern and measure it by its area. The formation of integer
patterns in Figure 4.10 reveals self-similarity, local and nonlocal symmetries.

5. Quantization of States of a Complex System as the 
Formations of Integer Relations 

Currently, there is no systematic way to characterize and quantify the states
of a complex system. In its absence mainly two approaches are used: two-point
correlation functions and dynamical invariants such as Lyapunov exponents and
fractal dimensions [4]. Both approaches are limited in applications [4, 12].
Figure 4.10. The formation of integer relations and the corresponding formation of integer
patterns (sketched) are pictured together to show their unified character. The figure allows us to 
see how the integer relations form into each other. An integer relation can be measured by the 
area of a corresponding integer pattern and the nonlocal correlation by the areas of all integer 
patterns. 



A quantization characterizes a state of a complex system by an element of a
structure, whose elements are distinguishable and to some extent comparable.
A quantitative description of an element of the structure may be used to measure
a corresponding state. The more adequately the quantization reflects reality,
the more we question why the structure is so special.

The structure incorporating the formations of integer relations into one whole
has not only rich properties for a quantization of complex systems but also, if
consistent with experimental facts, will not allow a deeper explanatory base. In
this quantization a state of a complex system can be described by a formation
process of integer relations and measured by corresponding integer patterns.
One state may be compared with another state by using corresponding formation
processes.

The comparison is based on the order of the formations of integer relations
and may be interpreted in terms of complexity. A formation process produces
integer relations of one level from integer relations of the previous one.
Therefore, integer relations of level $k > 1$ may be seen as more complex than integer
relations of level $k - 1$, because they are made of integer relations of the lower
level. Consequently, one state of a complex system described by integer
relations of level $k$ and higher may be seen as more complex than another state
described by integer relations of levels lower than level $k$.
Let information about a complex system be given by $N \ge 2$ time series, for
example by a number of EEG recordings, each describing a basic part of the
system. Then a state of the system is specified by an $N \times n$ matrix

$$S = (s_{ij}), \quad i = 1, \dots, N, \quad j = 1, \dots, n,$$

where a sequence $s_i = s_{i1} \dots s_{in} \in I_n$, $i = 1, \dots, N$ is a time series of part $i$
observed at instants $t_j = j\varepsilon$, $j = 1, \dots, n$. Let $\mathcal{S}(I_n, N)$ be the set of all states
of the complex system.

The states of the complex system can be characterized and quantified by
the formations of integer relations. In particular, a state $S \in S(I_n, N)$ can be
described by a set of formation processes



where $C(s_i, s_j)$ is the structural complexity of sequence $s_i$ with
respect to sequence $s_j$, $i, j = 1, \ldots, N$.

Let us explain the description of a state $S \in S(I_n, N)$ of the complex system.
A formation process $WR(s_i, s_j, n, m, I_n)$ describes the part of the system
composed of basic parts $i$ and $j$ and their relationship. Integer relations produced
by the formations $WR(S, n, m)$ may be involved in formation processes
resulting in integer relations of higher levels.

The following rule applies and sets a correspondence between integer relations
and parts of the complex system. Parts can form a composed part of
the system if their corresponding integer relations can form integer relations of
higher levels corresponding to the composed part of the system. As a result
the correlation length in the complex system increases, because
the parts become correlated in the composite part. The composite part in its
turn may combine with other parts of the complex system. In particular, integer
relations produced by one formation and integer relations produced by another
formation of the same level may form integer relations of a higher level. These
integer relations together with integer relations produced on this level by a
different formation may form integer relations of a still higher level.

Therefore, starting with integer relations produced by $WR(S, n, m)$ the formation
processes may compose integer relations of higher levels. This corresponds
to the construction of a structure of the state $S$ of the complex system.
In the structure each part of the complex system is presented as a result of the
formation from smaller parts and possibly as a block in the formations of larger
parts of the system. The geometrical interpretation of the integer relations can
be used to measure the relationships between the parts of the complex system.

For example, consider a state $S \in S(B_4, 3)$ of a complex system consisting
of three basic parts $s, s', s''$

$$S = \begin{pmatrix} +1 & -1 & -1 & +1 \\ -1 & +1 & +1 & -1 \\ +1 & -1 & +1 & -1 \end{pmatrix}, \tag{11}$$

where

$$s = +1\,-1\,-1\,+1, \quad s' = -1\,+1\,+1\,-1, \quad s'' = +1\,-1\,+1\,-1$$

and $B_4$ is the set of binary sequences of length 4. For simplicity the parts and
the sequences are denoted in the same way.

A formation process $WR(s, s', 4, 0, B_4)$ results in an integer relation

$$+4^1 - 3^1 - 2^1 + 1^1 = 0 \tag{12}$$

of the second level and formation processes

$$WR(s', s'', 4, 0, B_4), \quad WR(s'', s, 4, 0, B_4)$$

give integer relations

$$-4^0 + 3^0 = 0, \qquad +2^0 - 1^0 = 0 \tag{13}$$

of the first level accordingly. Integer relations (13) form an integer relation

$$-4^1 + 3^1 + 2^1 - 1^1 = 0 \tag{14}$$

of the second level. In a proper representation integer relations (12) and (14)
become

$$+8^1 - 7^1 - 6^1 + 5^1 = 0, \qquad -4^1 + 3^1 + 2^1 - 1^1 = 0$$

and form an integer relation

$$+8^2 - 7^2 - 6^2 + 5^2 - 4^2 + 3^2 + 2^2 - 1^2 = 0$$

of the third level. These formation processes of integer relations correspond to 
the formation of a structure of the complex system (see Figure 4.11). 
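
The arithmetic behind these relations is easy to check. The following is a minimal sketch, assuming that an integer relation of level k is read as a signed sum of (k-1)-th powers, which is how relations (12) to (14) and the third-level relation appear to be written. The helper name `signed_power_sum` is hypothetical and not part of the original text.

```python
def signed_power_sum(terms, power):
    """Return the sum of sign * base**power over (sign, base) pairs."""
    return sum(sign * base ** power for sign, base in terms)

# (sign, base) pairs transcribed from the relations in the text.
relation_12 = [(+1, 4), (-1, 3), (-1, 2), (+1, 1)]
relations_13 = [[(-1, 4), (+1, 3)], [(+1, 2), (-1, 1)]]
relation_14 = [(-1, 4), (+1, 3), (+1, 2), (-1, 1)]
third_level = [(+1, 8), (-1, 7), (-1, 6), (+1, 5),
               (-1, 4), (+1, 3), (+1, 2), (-1, 1)]

# Assumed convention: level k uses exponent k - 1.
print(signed_power_sum(relation_12, 1))                 # second level
print([signed_power_sum(r, 0) for r in relations_13])   # first level
print(signed_power_sum(relation_14, 1))                 # second level
print(signed_power_sum(third_level, 2))                 # third level
```

Each printed value is zero, so every relation holds as an identity between integers. The same signed pattern over 8, ..., 1 no longer sums to zero for third powers, which can be verified with `signed_power_sum(third_level, 3)`.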

In particular, it can be seen that the composite part at the right on the second
level is made of two parts, which are also composite. Each of these two parts
is made from basic parts and both contain the basic part $s''$, in one case with
the relationship directed to $s''$ and from $s''$ in the other. Thus, it may be said
that in the composite part the basic part $s''$ is represented in a superposition.
If observations of the composite part could be imagined to find part $s''$ in
one way or the other, it is tempting even to interpret this situation in quantum
mechanical terms.

Figure 4.11. The figure shows the formation of a structure of a state (11). The process is
hierarchical with the resulting structure at the top. A part at level $k$, $1 \le k \le 3$, can be positive
or negative and is composed of parts from level $k - 1$.

There is a wide range of possible structures of the states. They correspond 
to the structures constructed by the formations of integer relations. From one 
side there can be no integer relations produced by $WR(S, n, m)$ for a state
$S \in S(I_n, N)$. In this case the basic parts do not form any composite part
and we have a structure of minimum complexity. From another side there can
be integer relations produced by $WR(S, n, m)$ for a state $S \in S(I_n, N)$ in an
optimal way so that resulting integer relations of the highest possible levels can
be formed. In this case we have a structure of maximum complexity.

Importantly, the structure of the structures of the states is well enough defined.
It is known that an element of this superstructure, i.e., the structure
of a state, corresponds to a formation process of integer relations and that the
elements are organized in it by the formations of integer relations. This information
may be helpful in understanding phase transitions of a complex system.
In a phase transition the structure of a complex system changes from one to
another.

A question arises: how to characterize and quantify these structures of a
complex system in an efficient way. An approach is proposed to use a set of
variance-covariance matrices derived from state $S \in S(I_n, N)$ and employ
traces of these matrices [6].

In this section we consider the first approximation of the approach, which
uses the variance-covariance matrix

$$V(S) = \{c(s_i, s_j)\}_{i,j = 1, \ldots, N}$$

of the state $S \in S(I_n, N)$, where

$$c(s_i, s_j) = \frac{E(s_i s_j) - E(s_i) E(s_j)}{\sqrt{\sigma^2(s_i)\, \sigma^2(s_j)}}$$

is the linear correlation coefficient between $s_i$ and $s_j$, $\sigma^2(s_i), \sigma^2(s_j)$ are the
variances of $s_i$ and $s_j$, and $E(\,)$ denotes a time average over a period considered.
Let

$$\mathrm{Spec}(V(S)) = (\lambda_1, \ldots, \lambda_N)$$

be the eigenvalue spectrum of the variance-covariance matrix $V(S)$ of a state
$S \in S(I_n, N)$ and

$$\mathrm{tr}(V^2(S)) = \lambda_1^2 + \cdots + \lambda_N^2$$

be the quadratic trace of the variance-covariance matrix. Let $V(I_n, N)$ be the
set of all variance-covariance matrices $V(S)$ of the states $S \in S(I_n, N)$.

As the first approximation it is suggested to characterize and quantify the
structures of a complex system described in terms of the formations of integer
relations by using the quadratic trace $\mathrm{tr}(V^2(S))$ of the variance-covariance
matrix $V(S)$, $S \in S(I_n, N)$ [6]. For this purpose it is useful to consider two
extreme cases, i.e., one of minimum complexity structures and the other of
maximum complexity structures.

Minimum complexity structure corresponds to a state $S \in S(I_n, N)$ when
all sequences of the state are the same, i.e., all basic parts $i = 1, \ldots, N$ behave
in the same manner. In this case the complex system can be described by one
basic part, because it behaves as the complex system itself. There are no integer
relations produced by $WR(S, n, m)$ and basic parts do not form a composite
part. In this case any combination of parts behaves as the parts themselves.

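The variance-covariance matrix $V(S)$ of such a state $S \in S(I_n, N)$ is

$$V(S) = V_{\min} = \begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \\ 1 & 1 & \cdots & 1 & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 1 & 1 & \cdots & 1 & 1 \\ 1 & 1 & \cdots & 1 & 1 \end{pmatrix}, \quad \mathrm{Spec}(V_{\min}) = (N, 0, \ldots, 0).$$

Maximum complexity structure is considered with the help of a variance-covariance
matrix $V_{\max}$ whose coefficients, except the diagonal ones, are zero

$$V_{\max} = \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \\ 0 & 0 & \cdots & 0 & 1 \end{pmatrix}, \quad \mathrm{Spec}(V_{\max}) = (1, \ldots, 1),$$

and is realized for a state $S \in S(I_n, N)$ such that $V(S) = V_{\max}$.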
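From these two situations we have from one side

$$\mathrm{tr}(V_{\min}^2) = \max_{V \in V(I_n, N)} \mathrm{tr}(V^2) = \max_{V \in V(I_n, N)} \sum_{i=1}^{N} \lambda_i^2 = N^2 + \underbrace{0^2 + \cdots + 0^2}_{N-1} = N^2 \tag{15}$$

subject to

$$\mathrm{tr}(V) = \sum_{i=1}^{N} \lambda_i = N, \quad \lambda_i \ge 0, \; i = 1, \ldots, N,$$

and from another side

$$\mathrm{tr}(V_{\max}^2) = \min_{V \in V(I_n, N)} \mathrm{tr}(V^2) = \min_{V \in V(I_n, N)} \sum_{i=1}^{N} \lambda_i^2 = N \tag{16}$$

subject to

$$\mathrm{tr}(V) = \sum_{i=1}^{N} \lambda_i = N, \quad \lambda_i \ge 0, \; i = 1, \ldots, N.$$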
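
The two extreme values in (15) and (16) can be checked numerically. The sketch below is a minimal illustration, assuming that $V(S)$ is the matrix of linear correlation coefficients between the rows of $S$, computed here with NumPy's `corrcoef`. The function name `quadratic_trace` and the first two test states are hypothetical; the third state is the example (11).

```python
import numpy as np

def quadratic_trace(S):
    """tr(V^2) as the sum of squared eigenvalues of the row correlation matrix."""
    S = np.asarray(S, dtype=float)
    V = np.corrcoef(S)                    # rows of S are the time series
    eigenvalues = np.linalg.eigvalsh(V)   # V is symmetric
    return float(np.sum(eigenvalues ** 2))

# All N = 3 parts behave identically: assumed minimum complexity case.
identical = [[1, -1, -1, 1]] * 3
# Pairwise uncorrelated zero-mean rows: V is the identity matrix.
uncorrelated = [[1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
# The state (11) from the example.
state_11 = [[1, -1, -1, 1], [-1, 1, 1, -1], [1, -1, 1, -1]]

for name, S in [("identical", identical),
                ("uncorrelated", uncorrelated),
                ("state (11)", state_11)]:
    print(name, round(quadratic_trace(S), 6))
```

For N = 3 the identical rows give a value close to N^2 = 9 and the uncorrelated rows give a value close to N = 3. The state (11) falls in between, with a quadratic trace of 5, because s and s' are perfectly anticorrelated while s'' is uncorrelated with both.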

Combining (15) and (16) for the variance-covariance matrix $V(S) \in V(I_n, N)$
of a state $S \in S(I_n, N)$ we obtain

$$N \le \mathrm{tr}(V^2(S)) \le N^2. \tag{17}$$

By using (17) we get an interval for the quadratic trace $\mathrm{tr}(V^2(S))$ of a state
$S \in S(I_n, N)$

$$1 \le \frac{N^2}{\mathrm{tr}(V^2(S))} \le N, \tag{18}$$

which is considered to characterize and quantify the structures of the states. In
particular, for a state $S \in S(I_n, N)$ let

$$K_{FIR}(S) = \frac{N^2}{\mathrm{tr}(V^2(S))},$$

where FIR stands to show the connection with the formations of integer relations.
We have for $K_{FIR}$ from (18)

$$1 \le K_{FIR}(S) \le N. \tag{19}$$

The quantity $K_{FIR}(S)$, called a structural complexity of the state, is suggested
to measure the complexity of the structure of the state $S$ and by this
way the complexity of the state $S$ itself. The structural complexity $K_{FIR}(S)$
increases from the minimum at the left end of the interval to the maximum
bounded by the right end of the interval. Maximum structural complexity of a
complex system, according to (19), could increase as the number of its basic
parts $N$ becomes bigger.

The integer relations structures have a discrete nature, i.e., the integer relations
are organized into hierarchical levels. This suggests a possible quantization
of the quadratic trace interval (18) into separate regions with similar
structures. This quantization of the interval would be useful in a quantization
of the structures and thus in identifying qualitative changes or phase transitions
in complex systems.

In this context computational experiments have been made. They give some
evidence to suggest that such a quantization of the interval (18) may be in
place. In particular, random states have been studied by generating random
binary sequences to construct from them random walk functions as the rows of
the matrix $S$. The computations show that the structural complexity $K_{FIR}(S)$
of such a random state $S$, independently of $N$, belongs to a region

$$3 \lesssim K_{FIR}(S) \lesssim 5$$

as long as $N$ and $n$ are big enough. Figure 4.12 illustrates some results of
the computational experiments.

Figure 4.12. [Axis: COMPLEXITY, $N^2/\mathrm{tr}(V^2(S))$; marked values 1, 3, 5, $N$.] The interval is used to measure the complexity of the integer relations structures
and by this way the complexity of the states. Computational experiments suggest a possible
quantization of the interval into separate regions with similar structures. A region representing
random states is specifically shown.

The structural complexity $K_{FIR}$ is a new way to characterize complex systems.
It is important to relate $K_{FIR}$ with other similar means for dealing with
complex systems. A connection between the structural complexity $K_{FIR}$ and
the number of KLD (Karhunen-Loève decomposition) modes $D_{KLD}$ can be
identified. It is suggested in many disciplines to use $D_{KLD}$ to measure the
complexity of spatiotemporal data in large, high-dimensional, nontransient,
driven-dissipative systems [8-10].

We briefly present $D_{KLD}$ by using [4] to show the connection. The Karhunen-Loève
decomposition is a statistical method for compressing spatiotemporal
data by finding the largest linear subspace that contains substantial variations
of the data.

Let $u(t, x)$ be a one-dimensional zero-mean field on a spatial interval whose
values are measured on a finite space-time lattice of $n$ uniformly sampled time
points $t_i = i\varepsilon$, $i = 1, \ldots, n$, and of $N$ uniformly sampled spatial points
$x_j$, $j = 1, \ldots, N$. The measurement is analogous to EEG recordings
when spatial points are interpreted as electrode sites. An $n \times N$ rectangular data
matrix

$$A = \{A_{ij} = u(t_i, x_j)\}$$

can be defined from which an $N \times N$ symmetric positive semidefinite scatter
matrix $M = A^T A$ can be calculated, where $A^T$ denotes the matrix transpose of
$A$. The scatter matrix can be diagonalized to obtain its nonnegative eigenvalues
$\lambda_i^2$, $i = 1, \ldots, N$, which can be further decreasingly ordered

$$\lambda_1^2 \ge \lambda_2^2 \ge \ldots \ge \lambda_N^2 \ge 0.$$

A positive integer

$$D_{KLD} = \max\left\{ p : \frac{\sum_{i=1}^{p} \lambda_i^2}{\sum_{i=1}^{N} \lambda_i^2} \le r \right\} \tag{20}$$

is introduced to characterize the eigenvalues. It represents the largest number
of KLD modes $p$ needed to capture some specified fraction $r < 1$ of the total
variance $\sum_{i=1}^{N} \lambda_i^2$ of the data.

We can find from (20) that $D_{KLD}$, similar to $K_{FIR}$ in (19), belongs to the
interval

$$1 \le D_{KLD} \le N, \tag{21}$$

which $D_{KLD}$ quantizes uniformly.

It can be seen that if $K_{FIR} = 1$ then $D_{KLD} = 1$ for $r < 1$, and if $K_{FIR} = N$
then $D_{KLD} = N$ when $r$ belongs to some interval.
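
The link between the two measures can be illustrated numerically. The sketch below is hedged in two ways. It reads (20) as the number of leading modes needed for the cumulative eigenvalue fraction to reach $r$, which is an assumption about how the definition is meant to be applied. It also standardizes each time series to zero mean and unit variance, so that the scatter matrix is proportional to the correlation matrix. The names `standardize_rows`, `k_fir` and `d_kld` are hypothetical.

```python
import numpy as np

def standardize_rows(S):
    """Give every row zero mean and unit variance (assumes no constant rows)."""
    S = np.asarray(S, dtype=float)
    S = S - S.mean(axis=1, keepdims=True)
    return S / S.std(axis=1, keepdims=True)

def k_fir(S):
    """Structural complexity N^2 / tr(V^2) with V the row correlation matrix."""
    V = np.corrcoef(np.asarray(S, dtype=float))
    return V.shape[0] ** 2 / float(np.sum(np.linalg.eigvalsh(V) ** 2))

def d_kld(S, r):
    """Number of leading KLD modes needed to capture the fraction r (assumed reading)."""
    A = standardize_rows(S).T                  # n x N data matrix
    M = A.T @ A                                # N x N scatter matrix
    eig = np.clip(np.linalg.eigvalsh(M)[::-1], 0.0, None)  # decreasing order
    captured = np.cumsum(eig) / eig.sum()
    return int(np.argmax(captured >= r - 1e-12)) + 1

identical = [[1, -1, -1, 1]] * 3
uncorrelated = [[1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]

for name, S in [("identical", identical), ("uncorrelated", uncorrelated)]:
    print(name, round(k_fir(S), 6), d_kld(S, r=0.9), d_kld(S, r=0.5))
```

With identical rows both measures sit at the lower end, $K_{FIR} = 1$ and $D_{KLD} = 1$, for either value of $r$. With uncorrelated rows $K_{FIR} = N = 3$, while $D_{KLD}$ reaches $N$ only for $r$ above $(N-1)/N$, here for $r = 0.9$ but not for $r = 0.5$, which is one way to read the remark that $r$ must belong to some interval.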

The connection allows the structural complexity $K_{FIR}$ to be applied to complex
systems being studied by $D_{KLD}$. More importantly, it reveals $D_{KLD}$ in
the context of the $K_{FIR}$ conceptual framework, i.e., the formations of integer
relations. In particular, this shows where the complexity interpretation of $D_{KLD}$
may come from and that a different quantization of the interval (21), not the
uniform one of $D_{KLD}$, could be more adequate for complex systems.

6. Conclusions

Significant progress has been made recently in epileptic seizure prediction 
[2, 3]. Further advances in dealing with brain disorders may be confronted 
by the absence of quantitative measures of structures in complex systems [4]. 
Moreover, the situation is challenged because the laws of the brain may be still 
far away from our knowledge and experience. 

In this context the paper suggests seeking the answers within an irreducible
description of complex systems. In particular, a special type of processes, i.e.,
the formations of integer relations [5], is proposed to describe and quantize
complex systems in an irreducible way. As an approximation this results in a
new complexity measure expressed in terms of a trace of the variance-covariance
matrix.

A connection between the measure and measures being developed for structures
in complex systems is presented. This may suggest a new perspective to
describe and quantize complex systems in terms of the formations of integer
relations.

Abstract Epilepsy is characterized by the sudden occurrence of seizures disturbing the
perception or behavior of epileptic patients. Several prediction methods have
been claimed to predict seizures from EEG recordings minutes in advance,
which opens up new approaches to treat the disease. However, the term
“seizure prediction” is not unequivocally defined and different assessment criteria
for prediction methods exist, which impedes the comparison between methods.
Moreover, only little attention is paid to the dependency between sensitivity and
false prediction rate. We address these shortcomings and introduce a terminology
and assessment criterion for seizure prediction methods based on statistical and
clinical considerations: the seizure prediction characteristic.

Keywords: Epilepsy, seizure prediction, seizure prediction characteristic, sensitivity, false
prediction rate



1. Introduction 

In his work “On the Sacred Disease” Hippocrates (400 B.C.) refutes a curse
as the origin of epileptic seizures and suggests a brain disorder instead. Today we know that
during seizures abnormal electrical discharges in the brain impair its functioning.
Astonishingly, five in a hundred persons suffer from one epileptic seizure
during life and one in a hundred persons has recurrent seizures. Anticonvulsive
drugs or surgical therapies cannot help 12% of the patients. These patients have
to cope with the incessant uncertainty of arising seizures. A method to predict
seizures could improve therapeutic strategies dramatically. So far, several
seizure prediction methods based on the analysis of intracranial and scalp EEG
have been suggested with promising results, which are summarized in Table 5.1.
For a review see [Litt and Lehnertz, 2002] and [Litt and Echauz, 2002].

A seizure prediction method could be applied clinically by triggering an 
intervention system which is able to control an arising seizure, for example 
via the administration of strong anticonvulsive drugs into the epileptic focus or 
electrical stimulation of the vagus nerve. A simple intervention would be the 
warning of the patient. He could avoid dangerous situations like a busy street 
or a swimming pool. 

Figure 5.1 provides an example of how a seizure prediction method works. A
mathematical algorithm extracts a “feature” from the EEG recording. Once this
feature crosses a specific threshold level, an alarm is triggered. A comparison
of interictal periods far away from any seizure and pre-ictal periods resulting
in seizure onset leads to the choice of a suitable threshold value. In this case,
lower threshold values correspond to a higher sensitivity since more seizures
can be predicted correctly. However, lower thresholds also produce more false
predictions during the interictal epochs. The tight dependency between sensitivity
and the false prediction rate holds for every prediction method.

| Author | Seizure prediction method | # Patients | # Seizures | Interictal | MPT | Sensitivity | FPR |
|---|---|---|---|---|---|---|---|
| [Martinerie et al., 1998] | correlation density | 11 | 19 | 0 h | 2.64 min | 89% | / |
| [Osorio et al., 1998] | based on frequency analysis | 13 | 125 | 34 h | 0.25 min | 92% | 0 FP/h |
| [Lehnertz and Elger, 1998] | effective correlation dimension | 16 | 16 | 5.2-34.7 h | 11.5 min | 94% | 0 FP/h |
| [Le Van Quyen et al., 1999] | dynamical similarity index | 13 | 23 | 0 h | 5.75 min | 83% | / |
| [Le Van Quyen et al., 2000] | dynamical similarity index | 9 | 17 | 0 h | 4.45 min | 94% | / |
| [Iasemidis et al., 2001] | Lyapunov exponent | 5 | 58 | 0 h | 49.1 min | 91% | / |
| [Le Van Quyen et al., 2001] | dynamical similarity index (surface EEG) | 23 | 26 | 0 h | 7 min | 96% | / |
| [Lehnertz et al., 2001] | effective correlation dimension | 59 | 95 | > 115 h | 19 min | 47% | 0 FP/h |
| [Jerger et al., 2001] | seven different prediction methods | 4 | 12 | 0 h | 1-3 min | / | / |
| [De Clercq et al., 2002] | similarity index, correlation dimension (surface EEG) | 12 | 12 | 0 h | / | 0% | / |
| [Schindler et al., 2002] | leaky integrator and fire unit (surface EEG) | 7 | 15 | 144 h | 4-330 min | 100% | > 0.014 FP/h |
| [Navarro et al., 2002] | dynamical similarity index | 11 | 41 | 12-60 h | 7.54 min | 83% | 0.3 FP/h |
| [Mormann et al., 2003] | phase coherence, linear cross correlation | 10 | 14 | 15 h | 86/102 min | 86% | 0 FP/h |

Table 5.1. Achievements of seizure prediction methods developed to date. Listed are the number
of patients and seizures investigated, the total duration of interictal EEG data for calculation
of the false prediction rate, the mean prediction time (MPT), sensitivity, and the rate of false
predictions per hour (FPR). False prediction rates of 0 FP/h mean that no false prediction occurred
for the investigated EEG data. A slash marks values that are not listed. All but three studies were done with intracranial EEG data.

Three shortcomings exist which need to be resolved for further development
of seizure prediction methods:

1 Different assessment criteria of seizure prediction methods exist and the
term “seizure prediction” is not unequivocally defined.

2 Generally little attention has been paid to the dependency between sensitivity
and false prediction rate. The performances of most prediction
methods summarized in Table 5.1, for example, are characterized only by
their sensitivity without calculation of the false prediction rate.

3 All prediction methods were developed and tested on different EEG data
pools, making it difficult to compare their performance.

Figure 5.1. Dependency between sensitivity and false prediction rate. The upper and middle
panels display an example for EEG data (a) and an extracted feature (b) used by the seizure
prediction method “increments of the accumulated energy”. Below, one hour interictal (c) and
two hours pre-ictal epochs are shown (d,e). Bold vertical lines mark seizure onsets. Upward
crossing of a threshold (dashed line) triggers an alarm. Three different thresholds illustrate the
dependency between sensitivity and false prediction rate: For $T_1$ no alarm occurs during either
pre-ictal or interictal epochs, meaning zero sensitivity and zero false predictions. Threshold
$T_2$ leads to the correct prediction of the second seizure in (e) in a time interval 20 minutes
before seizure onset, at the expense of one false prediction during the interictal epoch in (c).
Decreasing the threshold to $T_3$ in order to predict the first seizure in (d) produces another false
alarm. Evaluation of a prediction method should require the simultaneous assessment of both
sensitivity and false prediction rate.

Osorio et al. suggested in 1998 that prediction methods should be evaluated
by both sensitivity and false prediction rate [Osorio et al., 1998]. We
extended this approach and developed an assessment criterion called “seizure
prediction characteristic” [Winterhalder et al., 2003a]. It takes into account
statistical and clinical considerations and makes it possible to assess and compare
different seizure prediction methods.

Figure 5.2. Definition of a correct prediction: The seizure does not occur before the end of
the seizure prediction horizon (SPH). This time interval is followed by the seizure occurrence
period (SOP), during which the seizure occurs, but the exact point of time is unknown. Seizures
outside of any SOP are not predicted and therefore are considered false negatives. Alarm signals
without a seizure in the following seizure occurrence period are false predictions.

In the following, we present a suitable terminology for seizure prediction
and the motivation, definition and eventually the calculation of the seizure
prediction characteristic. An application of the assessment criterion to three
seizure prediction methods can be found in this volume [Winterhalder et al.,
2003b].

2. The Seizure Prediction Characteristic 
2.1. Terminology 

A seizure prediction method has to forecast an impending epileptic seizure 
by raising an alarm in advance of seizure onset. A perfect prediction method 
would indicate the exact point in time when a seizure is to occur. This ideal 
behavior is not expected for current prediction methods analyzing EEG data. 
The uncertainty can be considered by use of the seizure occurrence period SOP, 
which is defined as a time period during which the seizure is to be expected 
(Fig. 5.2). In addition, in order to render a therapeutic intervention possible, a 
minimum window of time between the alarm raised by the prediction method 
and the beginning of SOP is essential. This time window is denoted as seizure
prediction horizon SPH. Taking into account the two time periods SPH and 
SOP, a correct prediction is defined as follows: After the alarm signal, during 
SPH, no seizure has occurred yet. During SOP a seizure occurs. The exact 
time of seizure onset may vary within SOP, thereby reflecting the uncertainty 
of the prediction. Seizures outside of any SOP are not predicted by the system 
and therefore classified as false negatives. Alarm signals without a seizure 
during SOP are false predictions.

Two measures describe the performance of a prediction method for given SPH
and SOP:

■ sensitivity, defined as the fraction of correctly predicted seizures among
all seizures

■ false prediction rate, the number of false predictions per time interval

As discussed above, these measures are not independent.
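
The definitions above translate directly into a small evaluation routine. The following is a minimal sketch, assuming that alarm times and seizure onsets are given in hours on a common time axis and that the false prediction rate is taken over a duration supplied by the caller. The function name `evaluate_alarms` and the example numbers are hypothetical.

```python
def evaluate_alarms(alarm_times, seizure_onsets, sph, sop, duration):
    """Return (sensitivity, false prediction rate) for given SPH and SOP.

    An alarm is counted as correct if a seizure onset falls inside its seizure
    occurrence period [alarm + sph, alarm + sph + sop]. All times share one unit.
    """
    predicted = set()
    false_predictions = 0
    for alarm in alarm_times:
        window_start = alarm + sph
        window_end = window_start + sop
        hits = [s for s in seizure_onsets if window_start <= s <= window_end]
        if hits:
            predicted.update(hits)        # seizures inside this SOP are predicted
        else:
            false_predictions += 1        # alarm without a seizure during SOP
    sensitivity = len(predicted) / len(seizure_onsets)
    false_prediction_rate = false_predictions / duration
    return sensitivity, false_prediction_rate

# Hypothetical 24-hour recording with times in hours.
alarms = [1.0, 5.0, 9.0]
seizures = [1.5, 12.0]
sensitivity, fpr = evaluate_alarms(alarms, seizures, sph=0.25, sop=0.5, duration=24.0)
print(sensitivity, fpr)
```

In this toy case only the seizure at 1.5 h lies inside an occurrence period, so one of two seizures is predicted and two of the three alarms are false. The seizure at 12.0 h is outside every SOP and counts as a false negative.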

2.2. Clinical Considerations 

Single false predictions are not avoidable in a realistic setting. Measurements
in large complex systems like the human brain are subject to fluctuations
that are likely to produce false alarms if the investigated time interval is long enough.
Should false alarms occur, the patient prepares for the arising seizure in vain. In
the case of electrical stimulations or administration of drugs, unnecessary side
effects may occur. If the number of false predictions per time interval is too large,
patients will not take further alarms seriously or will suffer from psychological
stress. Side effects of repeated interventions will add up and may lead to a
neurophysiological impairment. Depending on the patient and chosen intervention
system, a maximum false prediction rate FPRmax has to be defined
that is acceptable from a clinical point of view.

The average seizure incidence may serve as a basis for choosing reasonable
values for FPRmax. [Bauer and Burr, 2001] evaluated seizure
diaries of 63 patients resistant to anticonvulsant treatment. Based on nearly nine
years of documentation and about 313 seizures per patient on average, the mean
seizure rate amounts to 3 per month. Reduction of anti-epileptic drugs, e.g.,
during presurgical monitoring, leads to increased seizure frequencies. [Haut
et al., 2002] investigated seizure clustering for 91 patients with medically intractable
epilepsy who underwent monitoring for presurgical evaluation. The
averaged maximal number of seizures in a 24-hour period during monitoring
increased to 3.6 seizures per day compared to the low number under normal
conditions. Values of FPRmax higher than this seizure rate are questionable with respect to possible
clinical applications. Even if all seizures were predicted correctly, at least
50% of all alarms would be false alarms for patients during monitoring. This
percentage increases to 97% for epileptic patients under normal conditions.
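
The two percentages follow from a simple ratio of false alarms to all alarms. The sketch below reproduces them under stated assumptions: the false prediction rate is set to 3.6 per day, every seizure is predicted by exactly one alarm, and a month is taken as 30 days. The function name `false_alarm_fraction` is hypothetical.

```python
def false_alarm_fraction(fpr_per_day, seizures_per_day, sensitivity=1.0):
    """Fraction of all alarms that are false, assuming one true alarm per predicted seizure."""
    true_alarms_per_day = sensitivity * seizures_per_day
    return fpr_per_day / (fpr_per_day + true_alarms_per_day)

fpr_max = 3.6  # false predictions per day

# During presurgical monitoring: about 3.6 seizures per day.
monitoring = false_alarm_fraction(fpr_max, seizures_per_day=3.6)

# Under normal conditions: about 3 seizures per month (30 days assumed).
normal = false_alarm_fraction(fpr_max, seizures_per_day=3.0 / 30.0)

print(round(monitoring, 2), round(normal, 2))
```

The first value is 0.5 and the second rounds to 0.97, in line with the 50% and 97% quoted in the text. A sensitivity below 1 would only increase both fractions.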

Similar constraints exist for SPH and SOP, depending on patient and intervention
system. The application of anticonvulsive drugs needs a certain time
period to take effect due to the distribution of the drug, passage through
the blood-brain barrier and effect on the target neuron. Here, a minimum seizure
prediction horizon SPHmin is required. Electrical stimulation is supposed to
be fast acting and may require only a few seconds. If the patient is only warned,
SPHmin increases to tens of seconds - enough time for the patient to leave a
dangerous situation.

Because the exact point of time for seizure onset is unknown, interventions
should have effects lasting for the whole seizure occurrence period. Too long
durations for SOP may require further administrations of anticonvulsive drugs
or long electrical stimulations. In the case of a warning system, the patient’s
psychological stress increases with longer SOP, because a seizure is expected
at any moment during this time interval. Thus, too long an SOP would increase
the patient’s anxiety. The physiological and psychological stress determines an
upper bound for SOP, the maximum seizure occurrence period SOPmax.

2.3. Statistical Considerations 

To be regarded as a prediction method, a seizure prediction
method has to perform better than a prediction made in a random, periodical or other
nonspecific manner, independent of any prior information. Figure 5.3 displays
how seizures can be predicted correctly by chance: In general, the parameters
of a seizure prediction method will be adjusted to increase sensitivity until the
false prediction rate equals the upper bound FPRmax. Then, during a small
interictal time interval $I$ the probability for an alarm is $p = \mathrm{FPR}_{\max} I$. Observing
a longer time interval $W$ the probability for at least one alarm can be
calculated as follows:

$$\begin{aligned}
p(\text{no alarm in } I) &= 1 - \mathrm{FPR}_{\max} I \\
p(\text{no alarm in } W) &= (1 - \mathrm{FPR}_{\max} I)^{W/I} \\
p(\text{at least one alarm in } W) &= 1 - (1 - \mathrm{FPR}_{\max} I)^{W/I} \\
p(\text{at least one alarm in } W,\; W \gg I) &\approx 1 - e^{-\mathrm{FPR}_{\max} W}
\end{aligned}$$

With $W = \mathrm{SOP}$ this is exactly the sensitivity $S$ of a random prediction method,
because it is the probability of at least one alarm during the seizure occurrence
period.

A periodical prediction method raises alarms regularly after a certain period 
of time. If during interictal phases the false prediction rate equals FPRmax, the 
probability and therefore sensitivity S for an alarm during the seizure occurrence 
period SOP is 

$$S = \min\{\mathrm{FPR}_{\max} \cdot \mathrm{SOP},\; 100\%\}$$
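
The two chance-level sensitivities can be evaluated with a few lines of code. This is a minimal sketch, assuming FPRmax is given in false predictions per hour and SOP in hours; the function names are hypothetical. The discrete version keeps a finite small interval $I$ to show how it approaches the exponential limit.

```python
import math

def random_sensitivity(fpr_max, sop):
    """Sensitivity of the random predictor in the limit of a small interval I."""
    return 1.0 - math.exp(-fpr_max * sop)

def random_sensitivity_discrete(fpr_max, sop, step):
    """Same probability before taking the limit, with interval I = step."""
    return 1.0 - (1.0 - fpr_max * step) ** (sop / step)

def periodic_sensitivity(fpr_max, sop):
    """Sensitivity of the periodical predictor, capped at 100%."""
    return min(fpr_max * sop, 1.0)

fpr_max = 1.0        # false predictions per hour
sop = 50.0 / 60.0    # 50 minutes expressed in hours

print(round(random_sensitivity(fpr_max, sop), 2))
print(round(random_sensitivity_discrete(fpr_max, sop, step=1.0 / 3600.0), 4))
print(round(periodic_sensitivity(fpr_max, sop), 2))
```

For 1 FP/h and an SOP of 50 minutes the random predictor reaches about 0.57 and the periodical predictor about 0.83. With a step of one second the discrete expression already agrees with the exponential form to within about 0.0001.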

For large values of SOP or FPRmax, both the random and the periodical 
prediction method achieve high sensitivities approaching 100% (Fig. 5.3). 
This happens independently of the value for the seizure prediction horizon. 
Regarding a maximum false prediction rate of 1.0 false predictions per hour 
(FP/h) and a seizure occurrence period of 50 minutes, the random prediction 
method achieves a sensitivity of 57% and the periodical prediction method a 
sensitivity of 83%. Hence, for too high maximum false prediction rates or 
too long seizure occurrence periods, the performance of any specific seizure 
prediction method cannot be distinguished from the results of these unspecific
prediction methods.

Figure 5.3. Unspecific prediction methods. Upper panel: Periodical prediction method raising
alarms after a certain period of time. The random prediction method raises alarms by chance.
Since seizure occurrence periods can have an overlap, sensitivity is a bit worse than for the
periodical prediction method. Lower panel: Sensitivity depending on FPRmax for SOP = 50
minutes. Both methods converge to high sensitivity values for too high FPRmax. For example
with FPRmax = 1 FP/h, sensitivity amounts to 83% for the periodical prediction method.

2.4. Assessment Criterion: The Seizure Prediction Characteristic

The values for FPRmax, SPHmin, and SOPmax depend on a particular
clinical application, i.e., a patient and an intervention system. The application is generally
unknown during the development of a seizure prediction method. Therefore,
the method’s sensitivity should not be calculated for a fixed setting but for
a reasonable range of values for FPRmax, SPH, and SOP, leading to the
seizure prediction characteristic

$$S = S(\mathrm{FPR}_{\max}, \mathrm{SPH}, \mathrm{SOP}).$$

This approach enables the assessment and comparison of seizure prediction
methods independently of any particular clinical application. As a minimum
requirement, a prediction method should be superior to unspecific methods like
the random or periodical one by achieving a significantly higher seizure prediction
characteristic.

The calculation of the seizure prediction characteristic to evaluate a predic- 
tion method comprises five steps: 

1 Specification of the number of maximum tolerated false predictions dur- 
ing interictal periods FPRmax, SPH, and SOP. 

2 Adjustment of parameters of the prediction method, for example the value
of a threshold, until the false prediction rate equals FPRmax for every
single patient. Interictal data sets of at least 1/FPRmax duration for
each patient are required for this procedure.

3 Calculation of sensitivity S using the pre-ictal data sets of each patient. 

4 Averaging the values of sensitivity for all patients. 

5 Repetition of these steps for a reasonable range of values for FPRmax,
SPH, and SOP. Eventually the seizure prediction characteristic
S(FPRmax, SPH, SOP) can be estimated.
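
The five steps can be sketched in code for a threshold-based method. This is a simplified illustration under several assumptions that are not from the original text: alarms are raised at upward threshold crossings of a feature time series, the threshold is chosen as the lowest candidate whose interictal alarm rate does not exceed FPRmax (rather than matching it exactly), each pre-ictal segment ends at seizure onset, and the toy data are synthetic. All function names are hypothetical.

```python
import numpy as np

def upward_crossings(feature, threshold):
    """Sample indices at which the feature crosses the threshold from below."""
    above = np.asarray(feature) > threshold
    return np.flatnonzero(~above[:-1] & above[1:]) + 1

def tune_threshold(interictal, dt, fpr_max, candidates):
    """Step 2: lowest candidate threshold with interictal alarm rate <= fpr_max."""
    duration = len(interictal) * dt
    for threshold in sorted(candidates):
        if len(upward_crossings(interictal, threshold)) / duration <= fpr_max:
            return threshold
    raise ValueError("no candidate threshold satisfies fpr_max")

def patient_sensitivity(preictal_segments, dt, threshold, sph, sop):
    """Step 3: fraction of seizures with an alarm in [onset - sph - sop, onset - sph]."""
    predicted = 0
    for segment in preictal_segments:
        onset = len(segment) * dt            # segment ends at seizure onset
        alarms = upward_crossings(segment, threshold) * dt
        if np.any((alarms >= onset - sph - sop) & (alarms <= onset - sph)):
            predicted += 1
    return predicted / len(preictal_segments)

def prediction_characteristic(patients, dt, fpr_max, sph, sop, candidates):
    """Steps 2 to 4 for one setting (step 1); step 5 repeats this over settings."""
    values = []
    for interictal, preictal_segments in patients:
        threshold = tune_threshold(interictal, dt, fpr_max, candidates)
        values.append(patient_sensitivity(preictal_segments, dt, threshold, sph, sop))
    return float(np.mean(values))

def bumps(length, peaks):
    """Synthetic feature: zeros with isolated peaks at the given sample indices."""
    x = np.zeros(length)
    for index, height in peaks.items():
        x[index] = height
    return x

dt = 1.0 / 60.0                              # one sample per minute, in hours
patient_a = (bumps(600, {100: 0.6, 300: 0.9}),
             [bumps(120, {80: 0.8}), bumps(120, {90: 0.5})])
patient_b = (bumps(600, {200: 0.5}), [bumps(120, {100: 0.9})])
patients = [patient_a, patient_b]
candidates = [0.4, 0.7, 1.0]

# Step 5: sweep FPRmax (per hour) for SPH = 10 min and SOP = 50 min.
characteristic = {
    fpr_max: prediction_characteristic(patients, dt, fpr_max,
                                       sph=10.0 / 60.0, sop=50.0 / 60.0,
                                       candidates=candidates)
    for fpr_max in (0.15, 0.25)
}
print(characteristic)
```

On these synthetic data the averaged sensitivity rises from 0.75 at 0.15 FP/h to 1.0 at 0.25 FP/h, because the looser bound allows a lower threshold for the first patient. A full evaluation would also sweep SPH and SOP and compare the result with the random and periodical predictors.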

3. Conclusion 

We suggest application of the seizure prediction characteristic, i.e., the
sensitivity S as a function of the maximum false prediction rate FPRmax, the seizure
prediction horizon SPH, and the seizure occurrence period SOP, to determine the
performance of a seizure prediction method. In this way, it is possible to assess
and to compare prediction methods and to choose a suitable method for a particular
patient and type of intervention. The minimum requirement for a seizure
prediction method is a significantly higher seizure prediction characteristic than
for unspecific prediction methods.

Abstract Several methods have been suggested to predict the onset of epileptic seizures
from EEG data. We evaluated the performance of three prediction methods:
the “dynamical similarity index”, the “effective correlation dimension” and an
extended, prospective version of the “accumulated energy”. These prediction
methods were applied to a large pool of intracranial EEG data from 21 patients.
Altogether, 582 hours of EEG data and 88 seizures were investigated.

The “seizure prediction characteristic” was used as assessment criterion. It
considers the strong dependency between sensitivity and the false prediction
rate. For a rate of 1 to 3.6 false predictions per day, the similarity index yields a
sensitivity between 21% and 42%, which was the best result of the three examined
prediction methods. The extended version of the accumulated energy achieves a
sensitivity between 18% and 31%, the effective correlation dimension between
13% and 30%.

Keywords: Epilepsy, seizure prediction, seizure prediction characteristic, intracranial EEG
data, false prediction rate



1. Introduction 

The major characterizing symptoms of epilepsy are seizures affecting con- 
trol of movement, consciousness or perception. The unforeseeable occurrence 
of seizures leads to great mental strain for the patients. About 0.7% of the 
population in the industrial countries suffer from this disease [Hauser et al.,
1993]. 

It is possible to control epileptic seizures using anticonvulsive drugs in 70% 
of all patients. In a subgroup of 18% of all patients, seizure control can be 
achieved by epilepsy surgery [Wiebe et al., 2001]. However, 12% of all patients 
can neither be treated by medication nor by epilepsy surgery. 

Therefore, new therapeutic options are necessary. One option to treat epilepsy 
might be the development of a device to prevent a seizure in advance of seizure 
onset, called “brain defibrillator” in analogy to cardiac defibrillators [Milton 
and Jung, 2003]. For this purpose, two important problems have to be solved. 
One problem is how to control the generation of epileptic activity. This might be 
solved by using interventions like electrical stimulation or drug delivery using 
a minipump [Nicolelis, 2001]. 

A different challenge is to predict the onset of an upcoming seizure. Here,
special interest is focussed on the prediction of epileptic seizures from
intracranial or scalp EEG data by methods based on nonlinear and linear time
series analysis. Several investigations claim promising results [Lehnertz and
Elger, 1995, Lehnertz and Elger, 1998, Lehnertz et al., 2001, Iasemidis et al.,
1990, Iasemidis et al., 2001, Le van Quyen et al., 1999, Le van Quyen et al.,
2001b, Le van Quyen et al., 2001a, Navarro et al., 2002, Litt et al., 2001,
Mormann et al., 2000, Schindler et al., 2002, Jerger et al., 2001]. However, most
of these studies have been evaluated using only a few seizures, a low number
of patients or insufficient interictal, seizure-free data sets. Furthermore, no
recognized performance standards to assess prediction methods exist [Litt and
Lehnertz, 2002], making it difficult to compare different methods.

We present the assessment and comparison of three univariate seizure prediction
methods: the “dynamical similarity index” [Le van Quyen et al., 1999, Le
van Quyen et al., 2001b, Le van Quyen et al., 2001a, Navarro et al., 2002],
the “effective correlation dimension” [Lehnertz and Elger, 1995, Lehnertz and
Elger, 1998, Lehnertz et al., 2001] and an extended, prospective version of the
“accumulated energy” [Litt et al., 2001].

The “seizure prediction characteristic” was used as assessment criterion
[Winterhalder et al., 2003, Maiwald et al., 2003]. It mirrors the functional
relation between sensitivity S and three measures characterizing a prediction
method: First, the maximum permitted number of false predictions per hour
FPRmax during interictal states. Second, the seizure prediction horizon SPH,
representing the time between an alarm and the earliest possible seizure onset.
And third, the seizure occurrence period SOP during which a seizure is
supposed to occur at any time.

We calculated the seizure prediction characteristic for the three prediction
methods using intracranial EEG data of 21 patients. Our data pool consists of
2-5 seizures and at least 24 hours of interictal EEG data for each patient, altogether
88 seizures and 582 hours of EEG data. It is intended to publish this still growing
data pool as an open source for the development and evaluation of prediction
methods.

The EEG data pool is described in more detail in the next section. Part 3
summarizes the concepts of the three investigated prediction methods. The 
steps to calculate the seizure prediction characteristic are reported in part 4, the 
results are given in part 5. 

2. EEG data and patient characteristics

The prediction methods were applied using intracranial EEG data from 21
patients suffering from pharmacorefractory focal epilepsy of temporal and
extratemporal origin. These data were recorded during presurgical epilepsy
monitoring using a sampling rate of 256 Hz or 512 Hz.

Invasive electrodes were used in order to study the EEG data at a high signal
to noise ratio. Depth electrodes were implanted stereotactically and subdural
electrodes via burr holes or open craniotomy. Preparation of the data included
bandpass filtering in the frequency domain between 0.5 and 120 Hz (0.5 and
80 Hz for the effective correlation dimension). A 50 Hz notch filter eliminated
possible line noise.
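The preprocessing described above can be illustrated with a minimal sketch. It assumes a brick-wall filter that simply zeroes FFT bins outside the pass band and near the line frequency; the function name `fft_bandpass_notch` and the notch half-width `notch_width` are hypothetical choices for illustration, and the filter actually used for the data pool may have been designed differently.

```python
import numpy as np

def fft_bandpass_notch(x, fs, low=0.5, high=120.0, notch=50.0, notch_width=1.0):
    """Band-pass and notch filter a signal by zeroing FFT bins.

    x: 1-D signal, fs: sampling rate in Hz.
    Bins outside [low, high] Hz and bins within notch_width Hz of the
    notch frequency are set to zero (assumed brick-wall design).
    """
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    keep = (freqs >= low) & (freqs <= high)           # pass band
    keep &= np.abs(freqs - notch) > notch_width       # remove line noise
    return np.fft.irfft(spectrum * keep, n=x.size)

# Usage: a 10 Hz component survives, a 50 Hz component is removed.
fs = 256
t = np.arange(0, 4, 1.0 / fs)
signal = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 50 * t)
filtered = fft_bandpass_notch(signal, fs)
```

For the effective correlation dimension, the same call with `high=80.0` would correspond to the narrower band mentioned in the text.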

A certified epileptologist selected six electrodes of all implanted electrodes, 
which were referenced to an electrode displaying a minimal amount of epileptic 
activity. For every patient, 2-5 seizures (mean 4.2) were examined, including 
pre-seizure EEG data of at least 50 min duration. These pre-seizure data sets
were used to calculate the sensitivity of the prediction methods. 

To evaluate false prediction rates, long-term interictal, seizure-free EEG data 
are necessary. We investigated 24 hours of interictal data for each patient. For
13 patients, 24 hours of contiguous interictal recordings were available. In the
remaining cases, seizure-free periods were shorter than 24 hours, so a few
contiguous intervals were combined to reach at least 24 hours duration.

3. Seizure Prediction Methods 

We examined the performance of three prediction methods: the dynamical
similarity index, the effective correlation dimension and an extended, prospec- 
tive version of the accumulated energy. The main steps to calculate these 
prediction methods are summarized in this section. 

3.1. Dynamical Similarity Index 

The dynamical similarity index was implemented as introduced in [Le van
Quyen et al., 1999]. Its capability to predict seizures several minutes in advance
was demonstrated in several studies [Le van Quyen et al., 2001b, Le van Quyen
et al., 2001a, Navarro et al., 2002]. The similarity index algorithm compares the
dynamical behavior in a sliding window $S_t$ to a fixed reference window $S_{ref}$,
which is chosen far away from any seizure.

As a first step, new time series $I_n$, $n \in \mathbb{N}$, are constructed by
computing the time intervals between two successive positive zero-crossings of
the EEG data. A delay embedding with embedding dimension $m$ and delay $\tau$
leads to $A_n = (I_n, I_{n-\tau}, \ldots, I_{n-(m-1)\tau})$. Next, a singular
value decomposition is applied to the trajectory matrix $A(S_{ref})$ of the
reference window. The trajectory matrices $A(S_t)$ and $A(S_{ref})$ are
projected on the principal axes of the reference window, yielding $X(S_t)$ and
$X(S_{ref})$, respectively. A random selection $Y(S_{ref})$ of $X(S_{ref})$ is
compared to $X(S_t)$ via the cross-correlation integral

$$C(S_{ref}, S_t) = \frac{1}{N_{ref} N_t} \sum_{i=1}^{N_{ref}} \sum_{j=1}^{N_t} \Theta\left(r - \|Y_i(S_{ref}) - X_j(S_t)\|\right).$$

Here, $\Theta$ denotes the Heaviside step function, $\|\cdot\|$ the Euclidean norm, and
$N_{ref}$ and $N_t$ the number of vectors of the reference and sliding window, respectively.
The distance $r$ is chosen as the 30% quantile of the cumulative neighborhood
distribution of the reference window. Finally, the dynamical similarity index
$\gamma(S_t)$ is given by

$$\gamma(S_t) = \frac{C(S_{ref}, S_t)}{\sqrt{C(S_{ref}, S_{ref})\, C(S_t, S_t)}}.$$


Threshold crossing with the constraint of a minimum crossing time of 150 
seconds was used as alarm signal. The threshold was varied. 
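The last two steps of the similarity index, the cross-correlation integral and its normalization, can be sketched as follows. This is only an illustration under stated assumptions: the inputs are taken to be already-projected state vectors stored as rows of NumPy arrays, the radius `r` is passed in rather than derived from the 30% quantile, the Heaviside function is evaluated with a strict inequality, and the function names are hypothetical.

```python
import numpy as np

def cross_correlation_integral(Y, X, r):
    """Fraction of pairs (Y_i, X_j) whose Euclidean distance is below r."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances, shape (len(Y), len(X)).
    dists = np.linalg.norm(Y[:, None, :] - X[None, :, :], axis=2)
    return np.mean(dists < r)

def similarity_index(ref, win, r):
    """Cross-correlation integral normalized by the two auto-integrals."""
    c_rt = cross_correlation_integral(ref, win, r)
    c_rr = cross_correlation_integral(ref, ref, r)
    c_tt = cross_correlation_integral(win, win, r)
    return c_rt / np.sqrt(c_rr * c_tt)

# Usage with synthetic vectors (not EEG data).
rng = np.random.default_rng(0)
reference = rng.normal(size=(50, 3))
same = similarity_index(reference, reference, r=1.0)         # identical dynamics
far = similarity_index(reference, reference + 100.0, r=1.0)  # no close pairs
```

With this normalization the index equals 1 when the sliding window reproduces the reference window and drops toward 0 as the two point clouds separate, which is why a downward threshold crossing can serve as the alarm signal.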

3.2. Effective Correlation Dimension 

Based on the correlation dimension $D_2$, which is an estimator for the fractal
dimension of the attractor of a deterministic dynamical system, the performance
of the effective correlation dimension to predict seizures was investigated
[Lehnertz and Elger, 1995, Lehnertz and Elger, 1998, Lehnertz et al., 2001].

The calculation starts with a delay embedding of the EEG time series for
several dimension values up to $m = 25$, leading to vectors $x_m(i)$. The
correlation sum

$$C_m(r) = \frac{1}{N(N-1)} \sum_{i \neq j} \Theta\left(r - \|x_m(i) - x_m(j)\|\right)$$

is calculated for a range of the radius $r$. Here, $\|\cdot\|$ denotes the maximum
norm. The correlation dimension is defined as

$$D_2 = \lim_{r \to 0} \frac{d \log C_m(r)}{d \log(r)}.$$

The limit requires a proper scaling region, which is not necessarily given for
measured data. To overcome this problem, the authors introduced an operational
method leading to the so-called effective correlation dimension $D_2^{eff}$. This
measure is applied to the EEG data using a sliding window technique.
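A minimal sketch of the correlation sum under the maximum norm is given below. It assumes a normalization by the number of ordered pairs with $i \neq j$ and a strict inequality for the Heaviside function; the helper names `delay_embed` and `correlation_sum` are hypothetical, and the operational rules that turn the slope of $\log C_m(r)$ into the effective correlation dimension are not reproduced here.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Delay embedding: rows are (x[i], x[i+tau], ..., x[i+(m-1)*tau])."""
    x = np.asarray(x, dtype=float)
    n = x.size - (m - 1) * tau
    return np.column_stack([x[i * tau: i * tau + n] for i in range(m)])

def correlation_sum(vectors, r):
    """Fraction of ordered pairs i != j with maximum-norm distance below r."""
    v = np.asarray(vectors, dtype=float)
    # Maximum-norm distance between every pair of embedded vectors.
    dists = np.max(np.abs(v[:, None, :] - v[None, :, :]), axis=2)
    off_diagonal = ~np.eye(v.shape[0], dtype=bool)  # exclude i == j
    return np.mean(dists[off_diagonal] < r)

# Usage: a tiny series embedded in two dimensions.
vectors = delay_embed([0, 1, 2, 3], m=2, tau=1)
c = correlation_sum(vectors, r=1.5)
```

In this toy case four of the six ordered pairs lie closer than 1.5, so `c` is 4/6. Estimating a dimension would require evaluating `correlation_sum` over a range of `r` and inspecting the slope on a log-log scale.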

In [Lehnertz et al., 2001] so-called “dimension drops” were evaluated,
characterized by the time interval $t_{drop}$ and maximal deviation $d_{drop}$ for which the values of
$D_2^{eff}$ dropped under a threshold. The average of $D_2^{eff}$ during interictal periods
was used as threshold. A dimension drop was defined as being predictive if it
directly precedes a seizure onset and the dropping parameters exceed the
maximum dropping parameters during interictal periods. Use of these criteria leads
to no false predictions, but in our investigation only one of all 88 seizures was
preceded by such a predictive dimension drop. The long interictal EEG data
used in our study might be a possible explanation for this result. Allowing false
predictions increased sensitivity and was necessary to calculate the seizure
prediction characteristic. This was achieved by varying the dropping parameters
$t_{drop}$ and $d_{drop}$.

3.3. Increments of Accumulated Energy 

Recently, the discriminating power of the accumulated energy with respect 
to pre-ictal and interictal periods of 50 min duration was investigated [Litt
et al., 2001]. About 90% of the pre-ictal and 88% of the interictal periods 
were classified correctly. Unfortunately, this method requires knowledge of 
seizure onset which is not given in a prospective analysis. Based on these 
promising results, we examined an extended, prospective version, the so called 
“increments of the accumulated energy.” 

The accumulated energy $AE$ is based on the “average energy”

$$E_k = \mathrm{mean}(x_i^2) \quad \text{for time window } k \; (k = 1, 2, \ldots)$$

calculated for a time window of 1.25 s, where $x_i$ denotes the electrode potential of
sample $i$. Two consecutive time windows are shifted by 0.45 s. The accumulated
energy $AE_m$ and the increments of the accumulated energy $iAE_m$ are defined
as

$$AE_m = \frac{1}{10} \sum_{k=10m-9}^{10m} E_k + AE_{m-1},$$

$$iAE_m = \frac{1}{10} \sum_{k=10m-9}^{10m} E_k = AE_m - AE_{m-1}.$$

A higher slope of $AE$ corresponds to higher increments $iAE$. Using a median
filter over 90 seconds ensures that only permanent changes of these increments
lead to different values of $iAE$. Threshold crossing of $iAE$ was used as alarm
signal and the threshold value was varied.
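The definitions of the average energy and of the increments can be sketched in a few lines. The sketch assumes that window length and shift are rounded to whole samples and that the accumulated energy starts from zero; both are illustrative choices, as are the function names. The 90-second median filter and the threshold step are left out.

```python
import numpy as np

def average_energy(x, fs, window_s=1.25, shift_s=0.45):
    """Mean squared amplitude in sliding windows (1.25 s long, shifted by 0.45 s)."""
    x = np.asarray(x, dtype=float)
    win = int(round(window_s * fs))    # window length in samples (rounded)
    step = int(round(shift_s * fs))    # shift in samples (rounded)
    starts = range(0, x.size - win + 1, step)
    return np.array([np.mean(x[s:s + win] ** 2) for s in starts])

def accumulated_energy_increments(energies, block=10):
    """Return (AE, iAE): iAE_m averages 10 consecutive E_k, AE_m is their running sum."""
    e = np.asarray(energies, dtype=float)
    n_blocks = e.size // block
    iae = e[:n_blocks * block].reshape(n_blocks, block).mean(axis=1)
    ae = np.cumsum(iae)                # AE_m = iAE_m + AE_{m-1}, assuming AE_0 = 0
    return ae, iae

# Usage: a constant signal of amplitude 2 has average energy 4 in every window.
e_const = average_energy(np.full(1000, 2.0), fs=100)
ae, iae = accumulated_energy_increments(np.arange(1, 21))
```

Because `ae` is a cumulative sum of `iae`, the increment is exactly the local slope of the accumulated energy, which is the quantity monitored for threshold crossings.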

4. Calculation of the seizure prediction characteristic 

The seizure prediction characteristic [Winterhalder et al., 2003] was used 
to assess the three prediction methods. It is a means to evaluate and compare 
seizure prediction methods. It is based on the functional relation of sensitivity 
S on three measures characterizing the performance of a given seizure predic- 
tion method, namely the maximum false prediction rate FPRmax, the seizure 
occurrence period SOP and the seizure prediction horizon SPH. The theo- 
retical background is also reported in another paper in this volume [Maiwald 
et al., 2003]. 

The calculation of the seizure prediction characteristic for a given prediction
method consists of five steps:

1 Specify the number of maximum tolerated false predictions during
interictal periods, the maximum false prediction rate FPRmax. Fix the
seizure occurrence period SOP, which is the time during which a seizure
is supposed to occur, and the seizure prediction horizon SPH, which
mirrors the time between an alarm and the earliest possible seizure onset.

2 Adjust parameters of the prediction method, e.g. the value of a threshold
or the time interval a feature has to stay below a threshold, until the
maximum false prediction rate is reached during interictal periods,
individually for each patient. Interictal data sets of at least 1/FPRmax
duration are required for this procedure.

3 Calculate the sensitivity using the pre-ictal data sets of each patient. 

4 Average the values of sensitivity for all patients. 

5 Finally, repeat these steps for a reasonable range of values for the maxi- 
mum false prediction rate FPRmax, the seizure occurrence period SOP 
and prediction horizon SPH. 

This procedure leads to the seizure prediction characteristic given by
sensitivity S depending on FPRmax, SOP and SPH,

S{FPRmax,SOP,SPH). 
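The five steps can be summarized in a short sketch. It rests on several assumptions that are not taken from the text: each patient is represented by a hypothetical dictionary holding the interictal recording length, the seizure onset times, and, for every candidate parameter setting, the number of interictal alarms and the alarm times in the pre-ictal data; step 2 is approximated by picking the most permissive setting that does not exceed FPRmax; and a seizure counts as predicted when its onset falls between alarm + SPH and alarm + SPH + SOP.

```python
from statistics import mean

def sensitivity(onsets, alarms, sph, sop):
    """Fraction of seizure onsets inside [alarm + SPH, alarm + SPH + SOP]."""
    hits = [any(a + sph <= t <= a + sph + sop for a in alarms) for t in onsets]
    return sum(hits) / len(hits)

def characteristic(patients, fpr_max, sph, sop):
    """Average sensitivity over patients for one (FPRmax, SPH, SOP) triple."""
    per_patient = []
    for p in patients:
        # Step 2: settings whose interictal false prediction rate
        # (alarms per hour) stays within fpr_max.
        allowed = [s for s in p["settings"]
                   if s["interictal_alarms"] / p["interictal_hours"] <= fpr_max]
        if not allowed:
            per_patient.append(0.0)  # assumption: no admissible setting, no predictions
            continue
        best = max(allowed, key=lambda s: s["interictal_alarms"])
        # Step 3: sensitivity on this patient's pre-ictal data.
        per_patient.append(sensitivity(p["onsets"], best["preictal_alarms"], sph, sop))
    # Step 4: average over all patients.
    return mean(per_patient)

# Usage with one made-up patient; all times are in minutes.
patient = {
    "interictal_hours": 24,
    "onsets": [100, 200],
    "settings": [
        {"interictal_alarms": 2, "preictal_alarms": [70]},
        {"interictal_alarms": 10, "preictal_alarms": [70, 170]},
    ],
}
s_strict = characteristic([patient], fpr_max=0.15, sph=5 / 60, sop=30)
s_loose = characteristic([patient], fpr_max=0.5, sph=5 / 60, sop=30)
```

Step 5 then amounts to calling `characteristic` over a grid of `fpr_max`, `sph` and `sop` values. The made-up patient shows the trade-off discussed in the text: the stricter false prediction rate yields the lower sensitivity.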

5. Results and Discussion 

As the seizure prediction characteristic depends on three different measures, 
it is necessary to fix at least one of them to display the result. In the following, 
the assessment and comparison of the three prediction methods is presented 
by the seizure prediction characteristic depending on, first the maximum false 
prediction rate, second the seizure occurrence period, and third the seizure 
prediction horizon. The values of sensitivity S of all three seizure prediction 
methods are compared to each other and to two unspecific prediction methods, 
the periodical and the random prediction methods [Maiwald et al., 2003]. Fi- 
nally, dependence of sensitivity S on two measures (FPRmax and SOP) is 
shown. 

5.1. Sensitivity depending on FPRmax 

First, dependence of sensitivity S on the maximum false prediction rate 
FPRmax for the dynamical similarity index (o), the increments of accumulated 
energy (o) and the effective correlation dimension (x) is shown (Fig. 6.1). 

The seizure occurrence period was fixed to 30 min and the seizure prediction
horizon to five seconds, corresponding to a very fast intervention system. The
vertical lines mark the mean seizure frequency under normal conditions (left)
[Bauer and Burr, 2001] and the averaged maximum seizure frequency during
presurgical monitoring (right) [Haut et al., 2002].

Figure 6.1. Dependence of sensitivity on the maximum false prediction rate for given seizure
occurrence period and prediction horizon, for the three examined prediction methods. The dotted
line displays the performance of the random, the solid line of the periodical prediction method.
All three prediction methods yield better results than the two unspecific ones. The dynamical
similarity index achieves highest sensitivity values.

The logarithmically scaled maximum false prediction rate covers three
regions. Evaluation of false prediction rates corresponding to the mean seizure
frequency under normal conditions of about three seizures per month requires
interictal EEG data of several days. The smallest value for FPRmax which can
be evaluated using an EEG data pool of 24 hours interictal data for each patient
amounts to one false prediction per day, i.e. 0.04 false predictions per
hour (FP/h).

For maximum false prediction rates smaller than the averaged maximum
seizure frequency during monitoring, the dynamical similarity index achieves
the best result, with sensitivity between 21% and 42%. The increments of
accumulated energy reach sensitivity values between 18% and 31%, the effective
correlation dimension between 13% and 30%.

Figure 6.2. Dependence of sensitivity on the seizure occurrence period for given maximum
false prediction rate and seizure prediction horizon. Again, all three prediction methods are
superior to the unspecific prediction methods (dotted and solid line). The dynamical similarity
index achieves the best result for all evaluated seizure occurrence periods.

For higher values of the maximum false prediction rate up to one false
prediction per hour, sensitivity of all three prediction methods achieves high values
of 70% - 100%. But accepting a false prediction rate of e.g. one false prediction
in every five hours, 57% of the alarm events would be false ones, if the patient
suffers from a seizure frequency corresponding to averaged maximum seizure
frequency during monitoring. Furthermore, compared to the mean seizure
frequency under normal conditions, 98% of the alarm events would be incorrect.
Hence, maximum false prediction rates higher than the averaged maximum
seizure frequency during monitoring are questionable.
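The two percentages quoted above can be reproduced with a small calculation. The sketch assumes a sensitivity of 100%, an averaged maximum seizure frequency during monitoring of 0.15 seizures per hour (3.6 per day), and a 30-day month for the rate of three seizures per month; the function name is hypothetical.

```python
def false_alarm_fraction(fpr_per_hour, seizures_per_hour, sensitivity=1.0):
    """Share of all alarms that are false, given alarm and seizure rates per hour."""
    true_alarms = sensitivity * seizures_per_hour
    return fpr_per_hour / (fpr_per_hour + true_alarms)

fpr = 1 / 5                       # one false prediction every five hours
monitoring = 0.15                 # assumed seizures per hour during monitoring
normal = 3 / (30 * 24)            # three seizures per month, 30-day month assumed

share_monitoring = false_alarm_fraction(fpr, monitoring)   # about 0.57
share_normal = false_alarm_fraction(fpr, normal)           # about 0.98
```

Any sensitivity below 100% reduces the number of correct alarms and therefore pushes both shares higher.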

All three prediction methods yield better results than the unspecific prediction 
methods independent of the investigated maximum false prediction rate. 

5.2. Sensitivity depending on SOP 

Next, values of sensitivity S depending on the seizure occurrence period are 
given. The maximum false prediction rate is fixed to 0.15 FP/h and the seizure
prediction horizon again to 5 seconds (Fig. 6.2). The dynamical similarity index 
again yields the best result. For small seizure occurrence periods, the increments 
of accumulated energy are superior to the effective correlation dimension. 

For seizure occurrence periods longer than 36 min, sensitivity of the
dynamical similarity index increases more slowly than that of the unspecific
prediction methods. Thus, the increase in sensitivity for larger values of SOP
is rather a statistical property than a performance feature of the prediction
method. This effect can also be detected for the increments of the accumulated
energy for SOP > 20 min.

Figure 6.3. Sensitivity depending on the seizure prediction horizon for fixed values of the
maximum false prediction rate and seizure occurrence period. As the random and periodical
prediction methods do not depend on the prediction horizon, their values for sensitivity are
constant (dotted and solid line).

5.3. Sensitivity depending on SPH 

The next example shows dependence of sensitivity S on the seizure prediction 
horizon SPH (Fig. 6.3). The seizure prediction horizon corresponds to the 
actual prediction performance of a prediction method. The maximum false 
prediction rate is again fixed to 0.15 false predictions per hour, the seizure 
occurrence period to 30 min. 

All three prediction methods show constant sensitivity values for seizure 
prediction horizons < 2 min. This is a promising result as this prediction 
time might be sufficient for most interventions, e.g. electrical stimulation. 

5.4. Sensitivity depending on FPRmax and SOP

As a last example, sensitivity S is displayed depending on two of the three
measures, the maximum false prediction rate and the seizure occurrence period.
The seizure prediction horizon was fixed to 5 seconds. The example shows
values for sensitivity for the dynamical similarity index (Fig. 6.4).

Figure 6.4. Sensitivity for the dynamical similarity index depending on the maximum false
prediction rate FPRmax and the seizure occurrence period SOP, for a given seizure prediction
horizon of 5 seconds. For several combinations of FPRmax and SOP, sensitivity reaches high
values. But the applicability in a therapeutical device using these combinations is questionable
due to the high values of the maximum false prediction rate or of the seizure occurrence
period. (Axes: FPRmax in FP/h; SOP in minutes.)

It illustrates an important fact: it is possible to achieve values for sensitivity
of 80% - 100% for a wide range of the maximum false prediction rate and the
seizure occurrence period. This is also shown in a gray-scaled figure for the
same example (Fig. 6.5). The upper right corner corresponds to high sensitivity
values. For example, for a seizure occurrence period of 40 min and one false
prediction every two hours, sensitivity amounts to 86%. But for a patient under
normal conditions suffering from three seizures per month, 98% of the alarm
events would be false ones.

The important question is, whether these combinations of values for the max- 
imum false prediction rate, seizure occurrence period, and prediction horizon 
are suitable for a therapeutical device to control seizures. 

6. Conclusion 

We have shown the assessment and comparison of three prediction methods.
This was possible by using the seizure prediction characteristic, which relates
sensitivity of a prediction method to three measures characterizing important
features of any prediction method.

Figure 6.5. Gray-scaled plot of sensitivity for the dynamical similarity index depending on the
maximum false prediction rate FPRmax and the seizure occurrence period SOP, for given
seizure prediction horizon.

Relating the maximum false prediction rate to the averaged maximum seizure
frequency during epilepsy monitoring, only maximum false prediction rates
lower than 0.15 false predictions per hour, corresponding to roughly three
seizures per day, should be considered. Even if a seizure prediction method
achieves a sensitivity of 100% at this rate, at least 50% of all alarms would be
false ones during presurgical monitoring. Even worse, patients under normal
conditions with three seizures per month would endure 97% false predictions.
Thus, higher values of the maximum false prediction rate are questionable with
respect to an application in a therapeutical device controlling a seizure.

Combining these values of FPRmax with a seizure occurrence period of 
half an hour and a very short prediction horizon of 5 seconds, sensitivity of the 
dynamical similarity index yields values between 21% and 42%, which was 
the best result of the three examined prediction methods. The extended version 
of the accumulated energy achieves a sensitivity between 18% and 31%, the 
effective correlation dimension between 13% and 30%. 

The results of the examined prediction methods are significantly better than
the performance of the unspecific random and periodical prediction methods.
This indicates that EEG data contain specific, predictive information during
pre-ictal periods. However, the resulting seizure prediction characteristics are
not sufficient for use in a therapeutical device to prevent epileptic seizures.
Abstract Rapid advances in technology are making the dream of treating human
neurological diseases with implanted electronic devices a reality. The more such devices
are able to exploit the properties of intrinsic neural control mechanisms, the more
effective they will be in re-establishing control in the setting of disease. Noise and
time delays are ubiquitous features of the nervous system. Three observations
suggest that in order to understand control in noisy neural dynamical systems with
retarded variables it will be necessary to change the focus from the identification
and characterization of attractors to a study of phenomena that occur near
stability boundaries (i.e., “critical phenomena”): 1) multistability has been identified
in simple neural loops, the onset of epileptic seizures, and human postural sway;
2) on-off intermittency and 3) power laws arise in the nervous system. These
observations support the possibility of developing strategies that treat neurological
disease by the addition of appropriately designed stimuli, including noise.



1. Introduction 

Every day the dream of treating epilepsy by using implanted electronic
devices comes closer to reality. The term "brain defibrillator" has been given to
implantable devices that detect the occurrence of an epileptic seizure and then
deliver a stimulus to abort it [Milton and Jung, 2002]. The advantage of such
devices is that they would be called upon only when needed and thus would
free patients from the troublesome side effects of anticonvulsant medications.
Major steps toward the construction of a brain defibrillator have already been
made; several techniques have been developed that enable seizure occurrence
to be predicted up to 30 minutes beforehand (this Conference); recent advances
in technology suggest that the construction of such a device is feasible [Hetling,
2002].

Only one question remains: how is the seizure to be aborted? Obviously, 
the more such devices are able to exploit the neural mechanisms involved in the 
generation of epileptic seizures, the more effective they will be in re-establishing 
neural control in the setting of a seizure. 

It has long been known that the application of a brief sensory or electrical
stimulus just after seizure onset can sometimes abort a seizure within 2 seconds
of application [Foss and Milton, 2002, Lesser et al., 1999, Milton, 2000,
Motamedi et al., 2002]. Figure 7.1A illustrates this phenomenon: a sudden noise
made 1.4 s after seizure onset terminates a seizure within 1.4 s [Milton, 2000].
A brief electrical stimulus stops 51% of after-discharges in human epileptic
brain within 2 s; 39% after 5 s [Lesser et al., 1999].

These observations are very suggestive of an underlying multistable
dynamical system [Foss and Milton, 2000, Foss and Milton, 2002, Milton, 2000, Milton
and Foss, 1997]. A schematic representation of a multistable dynamical system
is shown in Figure 7.1B. The multiple valleys correspond to the basins of
attraction for each attractor. Ridges of varying height separate the basins. These
ridges correspond to the separatrices, or energy barriers. Each of the basins of
attraction can be accessed by appropriate initialization. A brief stimulus causes
the dynamical system to switch from one basin to another; in Figure 7.1B from
the attractor associated with a seizure to one that is not.

Therapeutic strategies based on the principle of producing a switch between
two attractors must be performed with great care. A randomly timed stimulus
has low probability of causing switches [Lechner et al., 1996]. Moreover, the
use of electrical stimuli to abort a seizure is double-edged since an improperly
timed electrical pulse could itself cause a seizure [Ajmone Marsan, 1972], i.e.
the very event that the brain defibrillator hoped to prevent! Further complicating
this approach is that random uncontrolled fluctuations (“noise”) and conduction
time delays are ubiquitous in the nervous system. The boundaries that separate
the basins of attraction depend on the time delay and other properties of the
dynamical system [Pakdaman et al., 1998]. Noise together with delays can
produce dynamical behaviors that are not observed in dynamical systems that
do not possess time delays, e.g. statistical periodicity [Milton and Mackey,
2000].

Figure 7.1. A) Stopping a seizure with an auditory stimulus [Milton, 2000]. The average
seizure duration is determined for 71 spontaneously occurring seizures. The duration of the
seizure shortened by the auditory stimulus is an average over 69 seizures. B) Schematic representation
of the energy landscape for a hypothetical multistable dynamical system.

Here we review the current state of the art on the effects of brief stimuli and
noise on the dynamics of multistable dynamical systems with retarded variables.
It is important to realize that analytical tools for the study of noisy, time-delayed
dynamical systems are scarce and, in general, even the evaluation of the
associated probability densities is uncertain (for discussions see [Guillouzic et al.,
1999, Ohira and Yamane, 2000]). Consequently we discuss these issues in the
context of simple paradigms in which it has been possible to gain some insight,
namely, delayed recurrent neural loops and stick balancing at the fingertip.

2. Mathematical background and outline 

The fact that seizure cessation is not coincident with delivery of the aborting
stimulus (Figure 7.1A) implies that memory effects are important. A variety
of mechanisms can produce memory effects in neural dynamics, e.g. relative
refractory periods [Milton et al., 1993], activity-dependent changes in vesicle
pools [Hunter and Milton, 2001], propagation through a chain of oscillators
[Ermentrout and Kopell, 1994, Kopell, 1995]. Here we focus on memory effects
that arise from conduction time delays. Time delays are intrinsic components
of the nervous system and arise because neurons are spatially separated and
axonal conduction velocities are finite. Time delays in the central nervous system
range from 10-200 ms [Gotman, 1983, Miller, 1994, Milton, 2002].

In dynamical systems that possess conduction delays, mathematical models
take the form of delay differential equations (DDE). Examples include the
first-order DDEs that arise in the context of neural feedback control mechanisms
[an der Heiden, 1979, an der Heiden and Mackey, 1982, Mackey and Glass,
1977, Milton et al., 1989], i.e.

$$\dot{V}(t) + \alpha V(t) = f(V(t - \tau)) \qquad (1)$$

where $V(t)$, $V(t - \tau)$ are, respectively, the values of the state variable (such as
membrane potential) at times $t$, $t - \tau$. The time delay is $\tau$, $\alpha$ is a rate constant,
$f$ describes the feedback, and $\dot{V}$ is the first time derivative of $V$. In order to obtain
the solution of Equation 1 it is necessary to specify an initial function, $\phi$, on
the interval $[-\tau, 0]$. Multistability, i.e. the co-existence of multiple attractors,
readily arises in dynamical systems with retarded variables [S. A. Campbell
et al., 1995, Campbell et al., 1995, Foss et al., 1996, Foss et al., 1997b, Foss and
Milton, 2000, Losson et al., 1993, Pakdaman et al., 1998].
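The role of the initial function in Equation 1 becomes concrete in a numerical sketch. The following forward Euler scheme is an illustrative assumption, not a method taken from the text: it stores the history on $[-\tau, 0]$ in an array and reads the delayed value from it at each step. The function name `integrate_dde`, the step size, and the feedback functions used in the example are hypothetical.

```python
import numpy as np

def integrate_dde(f, alpha, tau, phi, t_end, dt=0.001):
    """Forward Euler for dV/dt = -alpha*V(t) + f(V(t - tau)), with V = phi on [-tau, 0]."""
    lag = int(round(tau / dt))       # delay expressed in time steps
    steps = int(round(t_end / dt))
    v = np.empty(lag + steps + 1)
    # History: samples of the initial function on [-tau, 0].
    v[:lag + 1] = [phi(-tau + i * dt) for i in range(lag + 1)]
    for n in range(lag, lag + steps):
        v[n + 1] = v[n] + dt * (-alpha * v[n] + f(v[n - lag]))
    t = np.arange(-lag, steps + 1) * dt
    return t, v

# Check: without feedback the solution decays like exp(-alpha * t).
t_free, v_free = integrate_dde(lambda v: 0.0, alpha=1.0, tau=0.5,
                               phi=lambda s: 1.0, t_end=1.0)

# Hypothetical sigmoidal negative feedback, two different initial functions.
feedback = lambda v: -2.0 * np.tanh(v)
_, v_a = integrate_dde(feedback, 1.0, 0.5, lambda s: 0.1, 5.0)
_, v_b = integrate_dde(feedback, 1.0, 0.5, lambda s: np.sin(10 * s), 5.0)
```

Comparing runs such as `v_a` and `v_b` is one simple way to explore whether different initial functions settle onto the same attractor. Whether a given feedback function actually produces multistability depends on its form and on the parameters, and is not guaranteed by this example.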

Multistability is most often discussed in the context of a sub-critical Hopf
bifurcation shown schematically in Figure 7.2 (other more complex mechanisms
are possible, for example in bursting neurons [Izhikevich, 2000]). Sub-critical
Hopf bifurcations characterize the onset of oscillations in neurons [Guttman
et al., 1980] and of self-maintaining traveling waves, such as spiral waves, in
model neural networks [Milton et al., 1993]. Figure 7.2 shows the stability
diagram for the occurrence of a sub-critical Hopf bifurcation in a dynamical system
in the presence of noise. The steady state solutions have been represented by
the maxima of the stationary density, i.e., the probability density after
transients have died out [Arnold, 1998, Horsthemke and Lefever, 1984, Longtin,
1991, Longtin et al., 1990]. In the hypothetical noise-free, or deterministic,
case the probability densities are $\delta$-functions.

Figure 7.2. Schematic representation of a sub-critical Hopf bifurcation. The solid lines
represent the stable solutions and the dashed lines the unstable solutions. This subcritical Hopf
bifurcation is symmetric about the horizontal axis; we show only the upper half.

For a given choice of parameter, $\mu$, the behavior of the dynamical system
shown in Figure 7.2 depends on the choice of initial condition. In the absence
of a time delay the initial conditions correspond to the values of the variables,
$V$, measured at a single point in time. However, when time delays are present
the initial conditions take the form of initial functions constructed from the
variables measured over an interval of time whose length equals the time delay.
It is not possible to draw the space of all initial functions for a time-delayed
dynamical system since the system dimension is infinite. Nonetheless we can
use Figure 7.2 as a “cartoon” to provide a framework for the present discussion.
When $\mu < A$, all initial conditions lead to a fixed point attractor. When $\mu > B$,
all initial conditions lead to a limit cycle attractor. In the region $A < \mu < B$ the
two attractors co-exist; some initial conditions lead to the fixed-point attractor,
others to the limit-cycle attractor. The set of initial conditions that lead to a
given attractor is the basin of attraction for that attractor. In order to be able to
design perturbations to induce a switch between the attractors it is necessary to
know 1) the boundaries that delineate the basins of attraction; and 2) the nature
of the paths that enable a trajectory to leave one basin of attraction and enter
another.
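The three regimes of the cartoon can be reproduced with a minimal sketch. It uses the textbook radial normal form of a sub-critical Hopf bifurcation, $\dot{r} = \mu r + r^3 - r^5$, which is an assumption introduced here for illustration and has no time delay. In this normal form the coexistence region is $-1/4 < \mu < 0$, so $A = -1/4$ and $B = 0$; the function names and the classification threshold are hypothetical.

```python
import math


def final_amplitude(mu, r0, t_end=200.0, dt=0.01):
    """Euler integration of the radial normal form dr/dt = mu*r + r**3 - r**5."""
    r = r0
    for _ in range(int(t_end / dt)):
        r += dt * (mu * r + r**3 - r**5)
    return r


def classify(mu, r0):
    """Label the attractor reached from the initial amplitude r0."""
    return "limit cycle" if final_amplitude(mu, r0) > 1e-3 else "fixed point"


for mu, r0 in [(-0.3, 0.9), (-0.1, 0.2), (-0.1, 0.5), (0.1, 0.01)]:
    print(f"mu = {mu:+.1f}, r0 = {r0:.2f} -> {classify(mu, r0)}")
```

For $\mu = -0.1$ the unstable cycle at $r \approx 0.34$ is the basin boundary: the initial amplitude 0.2 decays to the fixed point, whereas 0.5 grows onto the stable limit cycle.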

Within this framework there are two different explanations for the observation
that brief stimuli can abort a seizure (Figure 7.1A): 1) changes in neural
synchrony ($\mu > B$); and 2) multistability ($A < \mu < B$). Section 3 demonstrates
that the latency for changes in neural synchrony due to electric stimuli
and synaptic inputs is typically shorter than the latency seen in seizure abortion.
Section 4 shows that the latency seen in a stimulus-induced switch between
attractors in a dynamical system is typically greater than $\tau$.

Sections 5 and 6 deal with the confounding effects of neural noise. Noise is 
ubiquitous in the nervous system [Arielle et al., 1996, Verveen and DeFelice, 
1974]. There are two ways in which noise can enter into Equation 1. Additive 
noise (Section 5) refers to the case in which the effect of noise is independent 
of the state of the system, i.e. 



where the noise term is, for example, Gaussian distributed white noise. In Figure 7.2,
additive noise produces vertical fluctuations. Thus noise-induced transitions
between the attractors are possible if $A < \mu < B$; the exact values of $\mu$ for
which noise-induced transitions occur depend on the intensity of the additive
noise.

Additive noise-induced switching has been detected in a variety of neu- 
ral preparations. Examples include the switching times in the perception of 
ambiguous figures [Borsellino et al., 1972, Kruse and Stadler, 1995] and the 
detection of weak sensory signals via the mechanism of stochastic resonance 
[Gammaitoni et al., 1998, Moss et al., 1994]. The presence of three scaling 
regions in the two-point correlation function for the fluctuations in the center of 
pressure for postural sway has been suggested to reflect noise-induced switching 
between two periodic attractors [Eurich and Milton, 1996]. Empirical evidence 
for spontaneous switching between attractors in the brains of epileptic patients 
has been obtained from intracranial EEG recordings [Manuca et al., 1998]. In 
these studies, time series analysis suggested that the time variation of the EEG 
signals could be characterized by changes in a single variable. The observa- 
tions were most consistent with a model of bistability in which mesoscopic 
collections of neurons flip between two collective states. 

The second way that noise can enter into Equation 1 is in a state-dependent, 
or parametric, manner (Section 6), i.e. 



where $\psi(t)$ is, for example, Gaussian distributed white noise. With respect to
Figure 7.2, parametric noise produces horizontal fluctuations. Considerations
of the effects of parametric noise are particularly relevant for neurobiologists
since neural noise is typically state-dependent. For example, membrane noise
reflects fluctuations in conductance [Verveen and DeFelice, 1974]. Current is
proportional to the product of conductance and driving potential; hence the
effect of the noise is state-dependent. State-dependent noise lies at the basis of
the spontaneous fluctuations in pupil size [Stark et al., 1958] and in the clamped
pupil light reflex [Milton et al., 1989]. It also plays an important role in motor
[Harris and Wolpert, 1998] and balance [Cabrera and Milton, 2002] control.
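The distinction between the two kinds of noise can be made concrete with a minimal Euler-Maruyama sketch. The scalar model $\dot{x} = \mu x - x^3$ (a pitchfork normal form without delay) is a stand-in chosen for illustration, not the delayed equation of the text; the function name, parameter values and the Ito interpretation of the noise are likewise assumptions. Additive noise perturbs the state directly, whereas parametric noise perturbs $\mu$ and therefore enters multiplied by the state.

```python
import math
import random


def simulate(mu, sigma, x0, kind, n_steps=2000, dt=0.01, seed=0):
    """Euler-Maruyama integration of dx = (mu*x - x**3) dt + noise.

    kind = "additive":   noise term is sigma * dW     (state-independent)
    kind = "parametric": noise term is sigma * x * dW (mu -> mu + noise)
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment over one step
        drift = (mu * x - x**3) * dt
        if kind == "additive":
            x = x + drift + sigma * dw
        elif kind == "parametric":
            x = x + drift + sigma * x * dw
        else:
            raise ValueError("kind must be 'additive' or 'parametric'")
        path.append(x)
    return path


additive = simulate(mu=-0.5, sigma=0.3, x0=0.0, kind="additive")
parametric = simulate(mu=-0.5, sigma=0.3, x0=0.0, kind="parametric")
print("largest |x| with additive noise:  ", max(abs(v) for v in additive))
print("largest |x| with parametric noise:", max(abs(v) for v in parametric))
```

Started exactly at the fixed point, the additive run is kicked away immediately, while the parametric run never leaves it: the size of a parametric perturbation depends on the state at which it is applied.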






Understanding the effects of parametric noise is more complicated than for
additive noise because the full bifurcation diagram changes. Many of the same
phenomena that have been observed in the presence of additive noise have their
counterparts in the presence of parametric noise, for example, postponement
of Hopf bifurcations [Longtin, 1991, Longtin et al., 1990]. Often the effects of
parametric noise are more dramatic than those obtained with additive noise: additive
noise can destroy a pitchfork bifurcation, whereas parametric noise can destroy a Hopf
bifurcation [Arnold et al., 1999]. However, there are important differences
between the effects of additive and parametric noise. Some of these differences
are qualitative in nature: for noise-induced switching between attractors due
to additive noise it must be true that $A < \mu < B$; for parametric noise the
switching can occur even if, for example, $\mu < A$. More importantly, parametric
noise can produce phenomena that cannot be produced by additive noise. One
such noise-induced phenomenon is on-off intermittency, discussed in Section 6.

The presence of multistability suggests that $A < \mu < B$. In Section 6 we
show that measurements of the dynamics of neural networks lead to the surprising
conclusion that for many networks, $\mu$ is tuned very near the critical point $B$
(the * in Figure 7.2). In Section 7 the possibility that self-organized criticality is
a fundamental property of the nervous system is introduced [Beggs and Plenz,
2002, Chialvo and Bak, 1999, Kelso, 1999]. Thus changes in neural populations
at seizure onset may share features similar to those that arise in phase transitions
of physical systems [Horsthemenke and Lefever, 1984]. This observation has
important implications for the design of an effective brain defibrillator.

3. Changes in neural synchrony 

Epileptic seizures involve the synchronization of large populations of neurons 
[Jasper, 1969]; de-synchronization should abort the seizure. In this context 
a stimulating electrode produces an electric field that couples to neurons to 
different degrees related, in part, to each neuron’s physical distance from it. This 
coupling serves to differentially stimulate the neurons and hence desynchronize 
the population [Durand, 1993]. Sensory stimuli could operate in a similar way 
by changing the activity in subcortical nuclei that in turn project diffusely to 
cortical neurons, e.g. the synchronizing effects of thalamo-cortical circuits 
[Contreras et al., 1996]. 

Perhaps the best understood mechanism that gives rise to synchronization
is phase locking of oscillators. If a population of regularly firing neurons
receives input from a periodic source, that population may become entrained, or
synchronized, to that input if the frequencies are close, or are integer multiples
of one another (n:m phase locking). The periodic input may arise from
brain rhythms, such as hippocampal theta oscillations, that modulate the firing
probability of neurons, or from periodic synaptic input to the population.

Figure 7.3. Neuronal rate changes produce rapid and large changes in entrainment. A) An
Aplysia motoneuron (upper panels) is stimulated with a sinusoidal current which induces 1:1
phase locking. B) When a small hyperpolarizing current is applied, motoneuron firing is slowed
and is no longer entrained in the 1:1 pattern with the sinusoid. Note that although a higher order
locking pattern appears in the motoneuron response during the inhibitory phase, such patterns
are less effective at generating population synchrony than a 1:1 pattern [Knight, 1972, Hunter
and Milton, 2003]. C) The same motoneuron as in A. D) In place of the hyperpolarizing current,
an inhibitory interneuron is activated which is presynaptic to the motoneuron. This, too, slows
motoneuron firing, resulting in a nearly instantaneous loss of synchrony.



Since the ability of the population to synchronize to the input is a function
of the firing rate of the population, small changes in firing rate can induce large
changes in neuronal synchronization [Hunter et al., 1998, Hunter and Milton,
2002, Hunter and Milton, 2003]. Figure 7.3 compares the effect on synchronization
to a periodic input between two different sources of rate modulation:
1) tonic current injection, such as one might find with a seizure focus stimulating
electrode, and 2) synaptic stimulation, which might arise when the target site is
separated from the stimulus site via synapses, as in the case of the vagal nerve
stimulator.
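The sensitivity of entrainment to firing rate can be sketched with the sine circle map, a standard textbook model of a periodically forced oscillator that is swapped in here for illustration; it is not the neuron model used in the experiments. In this sketch `omega` plays the role of the ratio of the intrinsic firing rate to the input frequency and `k` is a hypothetical coupling strength. The rotation number is the average number of cycles completed per input period; a value of exactly 1 corresponds to 1:1 phase locking.

```python
import math


def rotation_number(omega, k, n_iter=5000, theta0=0.0):
    """Average phase advance per iteration of the lifted sine circle map."""
    theta = theta0
    for _ in range(n_iter):
        theta = theta + omega - (k / (2 * math.pi)) * math.sin(2 * math.pi * theta)
    return (theta - theta0) / n_iter


k = 0.9
for omega in (1.05, 1.20):
    print(f"omega = {omega:.2f}: rotation number = {rotation_number(omega, k):.4f}")
```

With `k = 0.9` the 1:1 locked band is $|\omega - 1| < k/2\pi \approx 0.14$, so a modest change in `omega` from 1.05 to 1.20 moves the oscillator out of the band and 1:1 entrainment is lost.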

Figure 7.3A shows a motoneuron from the Aplysia buccal ganglion which
is stimulated by a sine wave with 1:1 phase locking. As such, the neuron is
highly entrained to the sine wave. At 3.5 s, a hyperpolarizing current is applied
to the motoneuron, slowing motoneuron firing and reducing entrainment. Thus
tonic current injection is a means to sensitively control population entrainment
to a coherent periodic input. The change in synchrony is nearly instantaneous.
While this is certainly a desirable feature in a seizure control methodology,
it does not accord with observations such as those in Figure 7.1A, where the
response to a perturbation occurred more than 1 s after the stimulation.

In Figure 7.3B, we use a synaptic input to cause changes in motoneuron firing
rate. As in Figure 7.3A, the change in motoneuron entrainment to the periodic
input is nearly instantaneous. In some cases, the motoneuron shows evidence
of entrainment to the interneuron as well [Hunter and Milton, 2003], pointing to
a potential problem of trying to induce desynchronization via a coherent input
to a neural population.

These observations suggest that changes in neural synchrony to electrical
and synaptic inputs typically occur rapidly. Thus this mechanism does not
readily explain the magnitude of the latency observed between stimulus onset
and seizure cessation.

4. Multistability in delayed recurrent loops 

Recurrent inhibitory loops play an important role in epileptic seizures arising
from the amygdala-hippocampal complex [Schwartzkroin and McIntyre,
1997] and those generated from thalamocortical interactions [Coulter, 1997]
(Figure 7.4A); an excitatory neuron, E, gives off collateral branches that excite
an inhibitory interneuron, I, which in turn inhibits the firing of E [Mackey and
an der Heiden, 1984, Mackey and Milton, 1987, Milton, 1996, Milton et al.,
1990]. Time delays arise in recurrent loops because of the time taken for
the inhibitory signal to traverse the recurrent loop. Consequently, mathematical
models take the form of delay differential equations (DDE). It has been
shown that multistability readily arises in mathematical models of delayed neural
recurrent loops [Foss et al., 1996, Foss et al., 1997b] and experimentally in
electrical circuits [Foss et al., 1997b] and in recurrently clamped neurons [Foss
and Milton, 2000].

The complexities that arise in the use of stimuli to cause switches between
attractors in delayed recurrent loops can be appreciated by considering a simple
integrate-and-fire approximation to the recurrent inhibitory loop (Figure 7.4B).
The membrane potential of a neuron increases linearly until it reaches the firing
threshold, at which point the neuron fires and the membrane potential is reset
to its resting value. The firing of the neuron excites the inhibitory neuron, I,
which in turn, at a time $\tau$ later, delivers an inhibitory pulse to the excitatory
neuron, E. The advantage of this simple model is that in dimensionless form,
the dynamics depend only on two parameters: the magnitude of the inhibitory
pulse, $\Delta$, and $\tau$. Consequently, considerable insight can be obtained into the
behavior of the loop in response to perturbations, including noise.
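The verbal description above can be turned into a minimal event-driven sketch. The dimensionless conventions are assumptions made for this illustration: the potential rises with unit slope from a resting value of 0 to a threshold of 1, each inhibitory pulse lowers the potential instantaneously by $\Delta$ (possibly below rest), and `pending` is a hypothetical way of supplying the arrival times of pulses that are already travelling around the loop at $t = 0$.

```python
import bisect


def simulate_loop(tau, delta, n_spikes, pending=()):
    """Integrate-and-fire neuron with delayed recurrent inhibition.

    Returns (spike_times, pulses_per_isi), where pulses_per_isi[i] is the
    number of inhibitory pulses received before spike i since the previous one.
    """
    t, v = 0.0, 0.0
    queue = sorted(pending)  # arrival times of pulses still in the loop
    spikes, pulses_per_isi = [], []
    count = 0
    while len(spikes) < n_spikes:
        time_to_threshold = 1.0 - v
        if queue and queue[0] - t < time_to_threshold:
            # An inhibitory pulse arrives before the threshold is reached.
            arrival = queue.pop(0)
            v += arrival - t
            t = arrival
            v -= delta
            count += 1
        else:
            # The neuron fires, resets, and launches a pulse that returns tau later.
            t += time_to_threshold
            v = 0.0
            spikes.append(t)
            pulses_per_isi.append(count)
            count = 0
            bisect.insort(queue, t + tau)
    return spikes, pulses_per_isi


spikes, pulses = simulate_loop(tau=0.5, delta=0.8, n_spikes=6)
intervals = [b - a for a, b in zip(spikes, spikes[1:])]
print("interspike intervals:", intervals)
print("pulses per interval: ", pulses)
```

With a short delay every pulse arrives within the interval started by the spike that caused it, so one pulse falls in each interspike interval. Longer delays and different `pending` lists can be explored with the same function.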

Figure 7.4. A) Schematic representation of a recurrent inhibitory neural loop. B) The time
course for the membrane potential (vertical axis) for an integrate-and-fire approximation for the
recurrent loop shown in A). The horizontal dashed line is the firing threshold and the time delay,
$\tau$, is the time taken for neural activity to traverse the recurrent loop [Foss et al., 1997b, Milton
and Foss, 1997].

Figure 7.5. The four attractors that co-exist in the integrate-and-fire recurrent inhibitory loop
when $\tau = 4.1$ and $\Delta = 0.8$.

Figure 7.6. Structure of the functional space, $\Psi_3$, for the integrate-and-fire recurrent loop with
$\tau = 4.1$, $\Delta = 0.8$. This functional space is symmetric about the 45° line; we show only the
upper half.

The solutions of this integrate-and-fire delayed recurrent loop can be
constructed from segments of length $\tau$; each segment satisfies an equation of the
form

$$\tau = x + m + n\Delta \tag{4}$$

where $n$, $m$ are positive integers and $0 < x < 1$ [Foss et al., 1997b]. For $\tau$, $\Delta$
fixed, the total number of $(m, n)$ pairs that satisfy Equation 4 is the smallest
integer greater than $\tau/\Delta$. Since the number of such $(m, n)$ pairs is finite it
follows that all solutions are periodic. When $\tau < 1$, only a single attractor
exists with period $1 + \Delta$. Multistability occurs when $\tau > 1$. This is because
the inhibitory pulses are not necessarily the result of the preceding excitatory
spike. Figure 7.5 shows that four qualitatively different periodic attractors exist
when $\tau = 4.1$ and $\Delta = 0.8$: {11111}, {00005}, {10220}, {13100}, where
the notation gives the repeating unit of the periodic solution described by the
number of inhibitory pulses per interspike interval.
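The counting rule for Equation 4 can be checked by direct enumeration. The sketch below assumes that $m$ and $n$ range over the non-negative integers (zero included), since that convention reproduces the stated count; exact rational arithmetic is used to avoid rounding at the boundaries $x = 0$ and $x = 1$. The function name is hypothetical.

```python
from fractions import Fraction
import math


def segment_pairs(tau, delta):
    """List all (m, n, x) with tau = x + m + n*delta, 0 < x < 1, m, n >= 0."""
    tau, delta = Fraction(str(tau)), Fraction(str(delta))
    pairs = []
    n = 0
    while n * delta < tau:
        remainder = tau - n * delta  # this must equal x + m
        m = math.floor(remainder)
        x = remainder - m
        if 0 < x < 1:
            pairs.append((m, n, x))
        n += 1
    return pairs


pairs = segment_pairs(4.1, 0.8)
for m, n, x in pairs:
    print(f"m = {m}, n = {n}, x = {float(x):.1f}")
print("number of pairs:", len(pairs))
```

For $\tau = 4.1$ and $\Delta = 0.8$ the enumeration gives six pairs, which agrees with the smallest integer greater than $\tau/\Delta = 5.125$.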



It is not straightforward to understand how a switch between two attractors
occurs in a DDE that exhibits multistability. To appreciate the complexities it
is important to note that a DDE can be thought of as a procedure that maps a
function, $\psi$, of length $\tau$ onto a function of length $\tau$. For the integrate-and-fire
recurrent loop, the functions, $\psi$, are specified by the times of spike occurrence.
For $\tau = 4.1$, $\Delta = 0.8$, a sequence of three spike times, $\psi_3 = \{t_1, t_2, t_3\}$, is
sufficient to initialize the four coexistent attractors [Milton and Foss, 1997].
Each $\psi$ leads to a unique solution. As shown in Figure 7.6, the space of all
possible $\psi_3$ is partitioned between the four attractors. The integrate-and-fire
recurrent loop evolves in a functional space constructed from possible $\psi_i$,
$i = 1, 2, 3, \ldots$

In general for multistable DDEs, the structure of these basins of attraction in
$\psi$-space is exceedingly complex (for examples, see [Bayer and an der Heiden,
1998, Foss et al., 1996, Losson et al., 1993, Milton and Foss, 1997]). In order
to understand the effect of introduced perturbations it is convenient to think
of $\psi$ as being composed of two components: one component represents the
spike pattern generated as a solution of the delayed recurrent loop; the other
component represents the perturbations introduced by the experimentalist (or
noise, see next section) in an attempt to induce a switch between attractors. The
problem faced by the experimentalist is to add precisely timed perturbations in
order to change $\psi$ into a new $\psi$ that maps to a new basin of attraction. In
general it will not be possible to accomplish this task by adding a single pulse
(as occurs in dynamical systems that lack retarded variables [Guttman et al.,
1980, Winfree, 1980]).

Figure 7.7 shows an example in which the introduction of a single, carefully
timed inhibitory pulse causes a switch between two qualitatively different spiking
patterns. The pattern of spiking between the regions designated {11111}
and {13100} does not correspond to a stable solution of the model and hence
represents a transient. This transient lasts $\sim 2\tau$. Switching times longer than
$\tau$ are also observed between attractors in a recurrently clamped Aplysia
motoneuron (Figure 7.8); here the switching time is between $\tau$ and $2\tau$.



Figure 7.7. When $\tau = 4.1$, $\Delta = 0.8$, a carefully timed inhibitory pulse (*) induces a switch
from the {11111}-attractor to the {13100}-attractor. This switching is shown for an electrical
circuit that mimics the integrate-and-fire recurrent loop shown in Figure 7.4 [Foss et al., 1997b].
The time after stimulus presentation is shown in terms of $\tau$.

The fact that the switching time is longer than $\tau$ is easy to appreciate: a
function of length $\tau$ is required to define an attractor. One reason that switching
times longer than $\tau$ arise is because of transients that arise as the trajectory settles
down onto the new attractor. In the next section we show that a switch between
basins of attraction typically requires a number of steps.

5. Noise-induced switching between attractors 

The advantage of studying noise-induced transitions between two attractors
is that it provides insights into the basic underlying mechanism of how such
changes occur. Insights into how switches occur in dynamical systems that
possess retarded variables can be readily obtained from the integrate-and-fire
recurrent loop. Since in this model only the timing of spikes is important, it is
equivalent to inject noise into $\tau$ or $\Delta$. In the discussion that follows we consider
the case that noise has been added to $\Delta$.

Figure 7.8. When $\tau = 315$ ms a carefully timed inhibitory pulse (*) induces a switch from one
attractor to another and then back again in a neural loop constructed from a recurrently clamped
Aplysia motoneuron [Foss and Milton, 2000]. The time after stimulus presentation is shown in
terms of $\tau$. Clearly it is difficult to determine the exact point in time at which a switch occurs
between the two attractors and whether or not transients are present.

Two steps are involved in producing a switch between two attractors: 1)
leaving the basin of attraction of the first attractor; and 2) entering the basin of
attraction of the second. The exiting step can be assessed through measurements
of the dwell time. The dwell time is defined as the time interval between when a
trajectory first enters a given basin of attraction and the time that it leaves. In the
case of the integrate-and-fire recurrent loop a trajectory is said to leave a basin of
attraction when the number of inhibitory pulses in an interspike interval differs
from that expected. It has been shown that the fraction of trajectories remaining
in a given basin of attraction decreases exponentially as a function of time in the
presence of noise [Foss et al., 1997b]. An exponential distribution of dwell times
is the characteristic distribution observed for the times to cross a threshold in a
stochastic [Kramers, 1940] or a chaotic [Bauer and Bertsch, 1990, Legrand and
Sornette, 1990, Mackey and Milton, 1990] dynamical system. An exponential
distribution of times between seizures is observed for patients with medically
intractable epilepsy who take anti-convulsant medications [Milton et al., 1987].
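A minimal sketch of how dwell-time data could be compared with an exponential law follows. The function names are hypothetical and the four dwell times are made-up numbers used only to exercise the code; the escape rate is estimated as the reciprocal of the mean dwell time, which is the maximum-likelihood estimate for an exponential distribution.

```python
import math


def survival_fraction(dwell_times, t):
    """Fraction of trajectories still in the basin at time t."""
    return sum(1 for d in dwell_times if d > t) / len(dwell_times)


def exponential_rate(dwell_times):
    """Maximum-likelihood escape rate for exponentially distributed dwell times."""
    return len(dwell_times) / sum(dwell_times)


def predicted_survival(rate, t):
    """Survival fraction exp(-rate * t) expected for a constant escape rate."""
    return math.exp(-rate * t)


dwell_times = [1.0, 2.0, 3.0, 4.0]  # illustrative values, not data
rate = exponential_rate(dwell_times)
for t in (0.0, 2.5, 5.0):
    print(t, survival_fraction(dwell_times, t), predicted_survival(rate, t))
```

On a semi-logarithmic plot an exponential survival curve is a straight line with slope equal to minus the escape rate, so systematic curvature of the empirical curve signals the non-exponential statistics discussed later in this section.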

A more complex question is determining what happens to the trajectory once
it has exited the basin of attraction: Does it eventually return to the original
attractor? Does it switch to a new attractor? Typically the length of transients
is long compared to $\tau$, and the time between successive noisy perturbations is
short compared to $\tau$. Thus noise-prolonged transients can exist for considerable
periods of time and represent an example of metastability [Grotta-Ragazzo et al.,
1999].

Figure 7.9. Schematic representation of the most common path from the {11111}-attractor to
the {13100}-attractor [Foss et al., 1997b].



Figure 7.9 shows the most common path by which a switch occurs between
the {11111}-attractor and the {13100}-attractor in the delayed integrate-and-fire
recurrent loop [Foss et al., 1997b]. The positive subscripts refer to those
$\Delta$'s that occur after the trajectory leaves the attractor; the negative subscripts
refer to those that occur prior to this event. Thus the transients involved in the
switch between two attractors in this model reflect, in part, the fact that three
carefully timed $\Delta$'s are required to effect the switch.

In general there are an arbitrarily large number of paths from one attractor
to another. In the delayed integrate-and-fire recurrent loop, most of these paths
have a negligible probability of occurring. This is because of the constraints
placed on more than five $\Delta$'s and the requirement, for many paths, that the value
of $\Delta$ be more than three standard deviations from the mean.

However, the situation becomes much more complicated in models with
continuous feedback [Grotta-Ragazzo et al., 1999]. For example, the condition
for multistability in the Mackey-Glass equation [Mackey and Glass, 1977], i.e.

$$\dot{x}(t) = -\alpha x(t) + \frac{\beta x(t-\tau)}{1 + x^{n}(t-\tau)}$$

is that $\tau$ exceed a critical value, where $\alpha$, $n$, $\beta$, $\tau$ are constants. A particularly
convenient case to study noise-induced switching between two attractors is
when $n = 8$ [Foss et al., 1997a]: one attractor exists for $x > 0$, another
for $x < 0$ [Losson et al., 1993]. In this case, numerical simulations with additive
noise demonstrate that the switching times between the two co-existent
basins of attraction are non-exponential and possess long tails, with switching
times exceeding $1000\tau$ [Foss et al., 1997a]! Thus even if time delays are of
the order of 10 ms, switching times of the order of seconds are possible.
Non-exponential survival statistics characterize the response of after-discharges to
brief electrical pulses [Lesser et al., 1999].
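Why an even exponent allows one attractor for $x > 0$ and a mirror image for $x < 0$ can be seen in a minimal noise-free sketch. It assumes the standard form of the Mackey-Glass equation, $\dot{x} = -\alpha x + \beta x(t-\tau)/(1 + x(t-\tau)^n)$, a constant initial function and a simple Euler scheme; the parameter values are hypothetical and are not claimed to lie in the multistable regime studied in the cited work.

```python
def mackey_glass(history_value, alpha=1.0, beta=2.0, n=8, tau=2.0,
                 t_end=50.0, dt=0.01):
    """Euler integration of dx/dt = -alpha*x + beta*x_tau / (1 + x_tau**n).

    The initial function is the constant `history_value` on [-tau, 0].
    """
    n_delay = int(round(tau / dt))
    x = [history_value] * (n_delay + 1)
    for _ in range(int(round(t_end / dt))):
        x_now, x_tau = x[-1], x[-1 - n_delay]
        x.append(x_now + dt * (-alpha * x_now + beta * x_tau / (1.0 + x_tau ** n)))
    return x[n_delay:]


up = mackey_glass(0.5)
down = mackey_glass(-0.5)
print("range for positive history:", min(up), max(up))
print("range for negative history:", min(down), max(down))
```

For even $n$ the right-hand side is odd in $x$, so the solution started from the negated history is the exact mirror image, and without noise a trajectory never changes sign. Switching between the two sides therefore requires a perturbation such as the additive noise used in the cited simulations.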
6. On-off intermittency: Parametric, or state-dependent, noise

The effects of parametric noise on the dynamics of neural recurrent loops
have not yet been studied. However, important insights into the effects
of parametric noise on neural control have recently been obtained from studies of
stick balancing at the fingertip [Cabrera and Milton, 2002, Cabrera and Milton,
2003, Mehta and Schaal, 2002]. This visuomotor task cannot be performed
using memorized movement patterns, but requires continuous closed-loop control.
This paradigm has a number of advantages that make it well suited to
explorations of neural dynamics: 1) the movements of the balanced stick can
be measured with high precision using non-invasive 3-D motion analysis
techniques; and 2) mathematical models for the control of the inverted pendulum
with time-delayed feedback have been developed previously [Stepan, 1989].
Thus it is possible to directly compare prediction with experimental observation.

The crucial role played by time-delayed feedback in stick balancing is
demonstrated by the observation that longer sticks are much easier to balance than
shorter ones. This is because once the stick is sufficiently long, its rate of
movement becomes slow relative to the time required to make corrective
movements. The noisy fluctuations enter through the movements of the fingertip (hand)
that act through changes in the pivot point of the inverted pendulum, i.e. as a
parametric perturbation [Bogdanoff, 1962, Bogdanoff and Citron, 1965]. Even
for slow moving sticks the effect of neural feedforward predictive control is
small: trained subjects are only able to maintain control throughout visual
blank-out periods that last $< 2$ to $3\tau$, where $\tau$ corresponds to the neural latency
for stick balancing [Mehta and Schaal, 2002].

The controlled variable is the vertical displacement angle of the balanced
stick, $\theta$. Figure 7.10A shows the fluctuations in $\Delta z/l$ (equal to $\cos\theta$) as a
function of time, where $l$ is the length of the stick and $\Delta z$ is the difference in the
vertical coordinate of the upper and lower ends of the stick. These fluctuations
exhibit intermittency, i.e. intermittently there are large deviations. Dynamical
systems that exhibit intermittency are often characterized by the presence of
power laws. Figure 7.10B shows that the laminar phases exhibit a $-3/2$ power
law. The term laminar phase refers to the time interval between successive large
fluctuations. The laminar phases were measured by first choosing an arbitrary
threshold (solid horizontal line in Figure 7.10B), and then measuring the time
intervals between successive threshold crossings in the upward direction (i.e.
successive corrective movements). The power spectrum of the fluctuations in $\theta$
(not shown) contains two regions of $1/f^{\alpha}$ behavior: one with $\alpha \sim 0.5$, another
at higher frequencies with $\alpha \sim 2.5$. A $-3/2$ power law (together with a $-1/2$
power law in power spectra [Venkataramani et al., 1996]) strongly suggests that
the fluctuations in $\theta$ exhibit on-off intermittency.

Figure 7.10. A) Temporal series for $\Delta z/l$ (equal to $\cos\theta$) for a 62 cm stick balanced at the
fingertip. The horizontal line depicts the threshold position at 1.005 times the mean value. B)
Log-log plot of the probability that the time between successive corrective movements for stick
balancing at the fingertip is longer than $\delta t$, $P(\delta t)$. The latency for stick balancing is $\sim 100$ ms
for this subject. Data from [Cabrera and Milton, 2002].
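The measurement procedure for laminar phases can be sketched in a few lines. The function names, the sampling interval `dt` and the toy series are hypothetical; the slope estimator is an ordinary least-squares fit on log-log axes, which is one simple way (an assumption here, not necessarily the authors' method) of reading a power-law exponent from a plot such as Figure 7.10B.

```python
import math


def laminar_phases(series, threshold, dt=1.0):
    """Time intervals between successive upward crossings of the threshold."""
    crossings = [i for i in range(1, len(series))
                 if series[i - 1] <= threshold < series[i]]
    return [(b - a) * dt for a, b in zip(crossings, crossings[1:])]


def loglog_slope(xs, ys):
    """Least-squares slope of log10(y) against log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


toy_series = [0, 2, 0, 0, 2, 0, 2]  # illustrative values, not data
print(laminar_phases(toy_series, threshold=1))
print(loglog_slope([1, 10, 100], [1, 10 ** -1.5, 10 ** -3]))
```

In the toy series the upward crossings occur at samples 1, 4 and 6, so the laminar phases are 3 and 2 samples long; the second call recovers the exponent of an exact $-3/2$ power law.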

In deterministic dynamical systems, Type III intermittency produces laminar
phases with a $-3/2$ power law [Pomeau and Manneville, 1980]. In stochastic
dynamical systems on-off intermittency arises from the stochastic or chaotic
forcing of a control parameter across a stability boundary [Ding and Yang,
1995, Heagy et al., 1994, Platt et al., 1993]. This power law cannot be reproduced
using additive noise; it only is seen in the presence of parametric noise.
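The mechanism of forcing a control parameter across a stability boundary can be illustrated with a randomly driven logistic map, $x_{k+1} = a z_k x_k (1 - x_k)$ with $z_k$ uniform on $[0, 1)$. This is a generic textbook example swapped in for illustration, not a model of stick balancing, and the parameter values are hypothetical. Linearizing about $x = 0$, the mean logarithmic growth rate is $\ln a - 1$, so the state $x = 0$ loses stability on average at $a = e$; slightly above that value the trajectory spends long stretches near zero interrupted by bursts.

```python
import random


def on_off_map(a, n_steps, x0=0.1, seed=1):
    """Iterate x -> a*z*x*(1 - x), with z drawn uniformly from [0, 1)."""
    rng = random.Random(seed)
    x, xs = x0, []
    for _ in range(n_steps):
        x = a * rng.random() * x * (1.0 - x)
        xs.append(x)
    return xs


xs = on_off_map(a=2.8, n_steps=5000)
quiet = sum(1 for v in xs if v < 0.01) / len(xs)
print("largest excursion:", max(xs))
print("fraction of time spent below 0.01:", quiet)
```

Because the random factor multiplies the state, the forcing is parametric: the point $x = 0$ is never perturbed directly, and bursts arise only when the instantaneous parameter $a z_k$ stays above the stability boundary for long enough.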

The important corollary of the observation of on-off intermittency in a noisy
dynamical system is that its presence implies that the control mechanism must
be tuned in parameter space to be close to, or perhaps on, a stability boundary.
Power laws are expected to arise when control mechanisms are tuned at stability
boundaries [Bak et al., 1988, Chialvo and Bak, 1999, Kadanoff, 1993, Stanley
et al., 1998]. Indeed it has been shown that the changes in speed made by
the hand during stick balancing are Levy distributed and follow a power law
[Cabrera and Milton, 2003]. These studies suggest that the neural control of
stick balancing exhibits self-organized criticality.

It has been suggested that self-organized criticality is a feature of neural
networks that learn by making mistakes [Chialvo and Bak, 1999]. However,
the advantages of self-organized criticality for neural control are not yet clear.
One possible advantage is that it enables balance control to be maintained on
time scales shorter than the delay [Cabrera and Milton, 2002]. Indeed, on
examining Figure 7.10B we see that > 98% of the time intervals between
successive corrective movements are shorter than the neural latency for stick
balancing. In particular, the fluctuations in $\theta$ resemble a random walk for
which the mean value of $\theta$ is approximately zero, i.e. the upright position is
“statistically stabilized” [Cabrera and Milton, 2002].

7. Self-organized criticality 

Two recent observations suggest that self-organized criticality is not just
restricted to the control of stick balancing, but may be a fundamental organizing
property of neural populations. First, the bursting activity of neurons grown in
culture exhibits scaling phenomena [Segev et al., 2002]. Secondly, and more
relevant for our discussion, cortical networks in slices of rat cortex have been
shown to spontaneously evolve into an activity regime that lies on a boundary
between the extremes of complete randomness and redundant order [Beggs and
Plenz, 2002]. In these experiments, spontaneously occurring fluctuations in
the negative peaks of local field potentials were measured in isolated slices
of rat cortex using microelectrode arrays. Using the same approach for data
analysis employed for stick balancing, an arbitrary threshold was chosen and
the time between successive threshold crossings measured. It was found that
the laminar phases exhibited a $-3/2$ power law. These authors speculate that
cortical neural networks exist on the edge between epileptic seizures on the
one hand, and quiescence on the other. At the very least these observations
provide a nice explanation of why intrinsic neural mechanisms to stop seizures
exist even in the non-epileptic brain [Chkhenkeli, 2002, Chkhenkeli and Milton,
2002].

8. Discussion 

Is epilepsy an example of a dynamic disease [Milton and Black, 1995, Milton
and Jung, 2002]? A dynamic disease is defined as a disease that occurs in an
intact physiological control system that operates in a range of control parameters
[Glass and Mackey, 1979, Mackey and Glass, 1977, Mackey and Milton, 1987].
In other words, the pathology of the disease is dynamical rather than
structural. It is certainly true that the occurrence of an epileptic seizure represents
a qualitative change in the dynamics of neuronal activity. Qualitative changes
in dynamics are one of the hallmarks of a dynamic disease. If epilepsy is a
dynamic disease, then its treatment should be dynamic, i.e. therapeutic strategies
must be based on manipulation of the underlying dynamics. The hope is that
these dynamic strategies can be identified and eventually implemented as the
treatment arm of a brain defibrillator.

In dynamical systems with multiple basins of attraction, changes in
dynamics can arise as a consequence of: 1) perturbations that take you from one
basin to the other; 2) changes in the basins of attraction and their boundaries
due to changes in control parameters; and 3) changes in control parameters
that destabilize the co-existent attractors. Since the control parameters whose
change heralds seizure onset have not yet been identified we have focused on
the first two mechanisms for causing switches between basins of attraction.
From the point of view of the fixed-point attractor in Figure 7.2, the presence
of multistability implies that the control parameters must be tuned close to the
edge of its instability. The identification of critical phenomena (see below) in
neural dynamics strongly implies that the control parameters are not just tuned
close to the edge of instability, but in fact are tuned at the edge of instability!
In other words, neural dynamical networks exhibit self-organized criticality
[Kelso, 1999].

Briefly timed electrical stimuli are also used to control seizures based on
"control of chaos" techniques [Spano et al., 2002, Schiff et al., 1994]. The
application of this technique is based on the identification of unstable periodic 
orbits in a time series by the presence of so-called “walking in, walking out” 
phenomena. It is important to realize the underlying dynamic structure for this 
phenomenon, namely a saddle node bifurcation, is precisely the same as that of 
the separatrix that separates two basins of attraction in a multistable dynamical 
system. It has been suggested that it may be very difficult to distinguish between 
a chaotic dynamical system and a noisy multistable dynamical system [Glanz, 
1997]. However, a chaotic attractor exists within a basin of attraction, whereas 
the presence of multistability requires that the dynamical system be tuned close 
to a stability boundary. 

In the presence of noise, dynamical systems tuned close to instability are
anticipated to exhibit a number of critical phenomena. There is growing evidence
that critical phenomena, in addition to multistability, can indeed be detected
in neural dynamics: power laws [Collins and De Luca, 1994, Fitts and Posner,
1973], on-off intermittency [Beggs and Plenz, 2002, Cabrera and Milton,
2002], noise-induced postponement of bifurcations [Baer et al., 1989, Longtin,
1991, Longtin et al., 1990, Rinzel and Baer, 1988], critical slowing down
[Longtin and Hinzer, 1996, Matsumoto and Kunisawa, 1978], noise amplification
[Longtin et al., 1990], phase transitions [Haken et al., 1985, Kelso,
1984, Kelso et al., 1992, Meyer-Lindenberg et al., 2002, Schoner et al., 1986],
and scale-invariant Levy phenomena [Brockman and Giesel, 2000, Cabrera and
Milton, 2003, Segev et al., 2002, Viswanathan et al., 1996]. Moreover, it has
been demonstrated that self-organized criticality can arise in plausible neural
networks that learn by making mistakes [Chialvo and Bak, 1999].

Traditionally, the study of patients with epilepsy has provided the greatest
insights into the nature of the central nervous system of humans [Jackson,
1931, Penfield and Jasper, 1954, Morrell, 1985]. The identification of critical
phenomena in relation to phase transitions (e.g. the transition from liquid water
to ice) motivated a shift in attention of physicists away from considerations
of single atoms and molecules to an emphasis on the behavior of networks
of molecules [Barabasi, 2002, Kadanoff, 1993]. Is it possible that a similar
revolution is about to occur in our understanding of the dynamics of neural
networks? Do partial complex seizures arise from an epileptic focus or do
they represent the emergent dynamics of a large distributed epileptic network
[Chkhenkeli and Milton, 2002] that arise via phase transitions? Thus the search
for a therapeutic strategy to treat epileptic seizures appears to lie directly on the
route that may uncover the basic organizational structure(s) of large neural
networks. All that will be required is a dash of clinical epilepsy mixed with a
sprinkle of basic neuroscience and a dab of physics!

Abstract: This work explains the basic principles of diffusion weighted and diffusion tensor
magnetic resonance imaging (MRI) methods. Both theoretical and experimental
aspects are included. Scalar measures derived from diffusion tensors, including
new anisotropy measures in terms of entropy, are presented. The fiber-tract
mapping problem is discussed, along with the limitations of diffusion tensor MRI.
A directional coherence tensor formalism to smooth the diffusion tensor data is
presented. New approaches to diffusion imaging, such as diffusion spectrum
imaging and high angular resolution diffusion imaging, are briefly discussed.
Results from diffusion weighted and diffusion tensor imaging are presented from
excised rat spinal cord and brain images acquired at 17.6 T.

Keywords: diffusion weighted, diffusion tensor, magnetic resonance imaging, anisotropic
diffusion, anisotropy

1. Fundamentals of Diffusion Weighted Magnetic Resonance Imaging

A typical nuclear magnetic resonance experiment starts with the excitation
of the nuclei with a 90° radiofrequency (RF) pulse that tilts the magnetization
vector into the plane whose normal is along the main magnetic field. Spins that
are initially coherent dephase due to many factors, the most prominent of which
are magnetic field inhomogeneities and dipolar interactions. This results in a
decay of the electromotive force induced in the receiver. Figure 8.1 shows a spin
echo experiment [Hahn, 1950] where a subsequent application of a 180° RF pulse
reverses the dephasing due to inhomogeneities and the signal is reproduced.
The time between the 90° pulse and reformation of the echo is called TE, and
it is twice the time between the two RF pulses. This echo is detected by a
receiver antenna and is used to produce spectra. Careful application of magnetic
field gradients that change linearly in space enables the acquisition of magnetic
resonance images. The details of the techniques employed for spatial encoding
will not be explained in this work.

Figure 8.1. Spin echo experiment.

An experiment such as that depicted in Figure 8.1 takes on the order of
10 ms in a typical imaging experiment. In this timeframe the spins undergo
diffusion that results in mixing, which creates attenuation in the signal amplitude
when magnetic field inhomogeneities are present. This attenuation can be
enhanced by the application of so-called diffusion gradients. Figure 8.2 shows
two diffusion gradients added to the previous pulse sequence for this purpose.
In this “pulsed gradient spin echo” (PGSE) experiment [Stejskal and Tanner,
1965] two identical gradients around the 180° RF pulse are applied with a
time $\Delta$ between them. The duration of each gradient is denoted by $\delta$. If
$\mathbf{G}$ represents the linear magnetic field gradient applied at time $t_1$, after the
application of the gradient, the spins at the position $\mathbf{r}$ gain a phase shift

$$ \phi_1 = -\gamma \int_{t_1}^{t_1+\delta} \mathbf{G} \cdot \mathbf{r} \, dt = -\gamma \delta \, \mathbf{G} \cdot \mathbf{r} . \tag{1} $$



At a later time $t_2$, this phase change can be ‘undone’ by the application of
the same gradient along the opposite direction, or alternatively, along the same
direction but after the 180° RF pulse. A spin that moves to a different position
between the application of these pulses experiences a net phase shift since the
phase change due to the subsequent gradient, $\phi_2$, will not be equal to $-\phi_1$. As a
result, at a particular position in space, there will be many particles with different
phases. The signal received from a particular voxel is proportional to the total
transverse magnetization within that voxel, given by the two dimensional vector
sum of individual magnetizations (each with a magnetic moment $\mu$)

$$ M = \mu \left| \sum_{n=1}^{N} e^{i \phi_n} \right| . \tag{2} $$

It is clear that when the phases of the spins $\phi_n$ are all the same, the magnitude
of the detected magnetization, $M$, is equal to its maximum value of $N\mu$, whereas
when the phases are totally random, it is 0. Therefore, the attenuation in the
magnitude of the signal indicates the randomness of the phases at a particular
position, which in turn depends on the randomness in the spins’ motional history,
making it sensitive to the incoherent motion of the molecules alone. Diffusion
weighted MRI exploits this phenomenon to quantify diffusion that occurs within
the sample.
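The dependence of the detected magnitude on phase coherence can be illustrated with a minimal numerical sketch of the vector sum of individual moments. The number of spins, the unit moment and the uniform phase distribution are assumptions chosen only for illustration; for a finite number of spins the fully random case gives a value that is small compared with the coherent maximum rather than exactly zero.

```python
import cmath
import math
import random


def net_magnetization(phases, mu=1.0):
    """Magnitude of the vector sum of moments of size mu with the given phases."""
    return mu * abs(sum(cmath.exp(1j * phi) for phi in phases))


random.seed(0)       # fixed seed so the sketch is repeatable
n_spins = 10_000     # hypothetical number of spins in a voxel

# All spins share one phase: the moments add up coherently.
coherent = net_magnetization([0.3] * n_spins)

# Phases drawn uniformly from [0, 2*pi): the moments largely cancel.
scrambled = net_magnetization(
    [random.uniform(0.0, 2.0 * math.pi) for _ in range(n_spins)]
)

print(f"coherent  M / (N mu) = {coherent / n_spins:.4f}")
print(f"scrambled M / (N mu) = {scrambled / n_spins:.4f}")
```

Intermediate degrees of phase dispersion give intermediate magnitudes, which is the attenuation that diffusion weighting measures.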

The dynamics of the magnetization is governed by the phenomenological
Bloch equations [Bloch, 1946]. When diffusion is taken into account, the equation
for the transverse magnetization takes the form [Torrey, 1956]

$$ \frac{\partial M_+}{\partial t} = -i\omega_0 M_+ - i\gamma \, \mathbf{r} \cdot \mathbf{G} M_+ - M_+/T_2 + D \nabla^2 M_+ - \nabla \cdot (\mathbf{v} M_+) , \tag{3} $$

where $M_+ (= M_x + i M_y)$ is the transverse magnetization, $\omega_0$ is the Larmor
frequency, $\gamma$ is the gyromagnetic ratio, $T_2$ is the spin-spin relaxation time, $D$
is the diffusion coefficient, and the last term, quantifying coherent motion of the
spins, is included for completeness. From here on, we will call this equation
the Bloch-Torrey equation.

For a spin echo experiment, ignoring the last term, Eq. (3) can be solved by
using the substitution [Stejskal and Tanner, 1965]

$$ M_+(\mathbf{r}, t; \mathbf{G}) = A(t) \exp(-i\omega_0 t - t/T_2 - i\gamma \, \mathbf{r} \cdot \mathbf{F}) , \tag{4} $$



where

$$ \mathbf{F}(t) := \int_0^t \mathbf{G}(t') \, dt' - 2\Theta(t - TE/2) \int_0^{TE/2} \mathbf{G}(t') \, dt' , \tag{5} $$

and $\Theta(x)$ is the Heaviside step function, which is equal to unity when its
argument is positive, and 0 otherwise.

Figure 8.2. Pulsed gradient spin echo experiment. Two diffusion gradients G are applied
before and after the 180° pulse.

Inserting Eq. (4) into Eq. (3), the resulting first order differential equation
yields the Stejskal-Tanner formula for diffusive attenuation [Stejskal and
Tanner, 1965]

$$ S = S_0 \exp(-\gamma^2 \delta^2 G^2 (\Delta - \delta/3) D) = S_0 \exp(-bD) , \tag{6} $$

where $S := A(TE)$, $S_0 := A(0)$ and $b := \gamma^2 \delta^2 G^2 (\Delta - \delta/3)$ is the b-factor.

Enhancing diffusive attenuation with the application of gradients introduces a 
different contrast mechanism in magnetic resonance images. These images are 
called diffusion weighted images. Moreover, by repeating the experiment with 
different gradient strengths, it is possible to calculate the diffusion constants. 
The images where the pixels denote the calculated diffusion coefficients are 
called quantitative diffusivity maps. Figure 8.3 shows a diffusion weighted 
image and a quantitative diffusivity map from an excised rat spinal cord. 
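As a rough illustration of how a diffusion coefficient can be extracted from repeated measurements, the sketch below computes Stejskal-Tanner b-factors and fits a straight line to ln S against b. The pulse timings and gradient strengths are those quoted later in this chapter for the spinal cord experiment; the proton gyromagnetic ratio, the "true" diffusivity and the noise-free synthetic signals are assumptions, and the function names are hypothetical.

```python
import math

GAMMA = 2.675e8  # proton gyromagnetic ratio in rad / (s T); assumed nucleus


def b_value(gradient, delta, big_delta, gamma=GAMMA):
    """Stejskal-Tanner b-factor in s/m^2 for a gradient strength in T/m."""
    return (gamma * delta * gradient) ** 2 * (big_delta - delta / 3.0)


def fit_adc(b_values, signals):
    """Least-squares line through (b, ln S); returns (S0, D)."""
    n = len(b_values)
    logs = [math.log(s) for s in signals]
    mean_b = sum(b_values) / n
    mean_y = sum(logs) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in zip(b_values, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_b
    return math.exp(intercept), -slope


delta, big_delta = 2.4e-3, 17.8e-3        # s
gradients = [0.0, 0.18, 0.36, 0.54]       # T/m
true_s0, true_d = 1000.0, 1.0e-9          # arbitrary units, m^2/s (assumed)

bs = [b_value(g, delta, big_delta) for g in gradients]
signals = [true_s0 * math.exp(-b * true_d) for b in bs]   # synthetic data
s0_fit, d_fit = fit_adc(bs, signals)

print("b-values (s/mm^2):", [round(b * 1e-6) for b in bs])
print("fitted D (m^2/s):", d_fit)
```

The intercept of the same fit gives the nondiffusion weighted signal, and applying the fit voxel by voxel yields a quantitative diffusivity map.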

2. Diffusion Tensor Magnetic Resonance Imaging 

Diffusion-weighted images have been utilized extensively in the imaging of
neural tissue since it was shown that ischemic strokes can be detected much
earlier with diffusion weighted images as compared to traditional $T_1$ and $T_2$
weighted images [Moseley et al., 1990].

Neural tissue in the central nervous system (CNS) is primarily composed of two
distinct areas: white-matter and gray-matter. White-matter includes the axons
that transmit signals between different regions. The axons are covered with
fatty tissue called the myelin sheath that isolates the intracellular space from
extracellular space to increase the speed of signal transmission. Gray-matter
is mostly composed of cell bodies and nonmyelinated axons where the axonal
architecture is much more complex.

Figure 8.3. Top row includes three diffusion weighted images from an excised rat spinal cord
with b-values of approximately 20, 250 and 900 s/mm². The bottom left figure is the nondiffusion
weighted image given by the intercept of the fit. The bottom right image is the quantitative
diffusivity map where the diffusion sensitizing gradient is along the frequency encoding direction
(left to right).

As can be predicted by visually inspecting electron microscopic images
such as those given on pages 52 and 56 of Ref. [Waxman et al., 1995], the diffusion
coefficients along different directions in white-matter are different. This
diffusional anisotropy is mostly due to the cellular membranes restricting the
motion of water molecules, where myelin and the cytoskeleton are contributing
factors [Beaulieu and Allen, 1994].

The signal attenuation in diffusion weighted MR images is caused predominantly
by the diffusion gradients employed. So, the measured diffusion coefficient
depends on the direction along which these gradients are applied. Therefore,
by repeating the experiment with gradients applied along many directions,
it is possible to quantify the diffusional anisotropy in the voxel. This diffusional
anisotropy is presumed to be related to the structural anisotropy in the tissue.
The anisotropy maps produced can be expected to give high contrast between
the highly oriented areas such as white-matter and other regions.

The fact that diffusion coefficients are anisotropic can be used in yet another
way. In highly oriented regions, the diffusion coefficient will be largest along
the direction in which water is least restricted. Figure 8.4 shows the effect of
sensitizing the signal to diffusional processes along two directions. The top
row shows images when the gradients are applied along the direction pointing
perpendicular to the page, whereas the bottom row includes the images when the
gradients are applied along the up and down direction. When the diffusional process
is isotropic (as in free water), the signal attenuation is identical. However,
when there is structural anisotropy, the change in the images with increasing
b-values depends significantly on the directionality within the tissue. This is
most apparent in white-matter. Since water diffusion is least restricted along
the fiber directions, the signal attenuates more rapidly when the gradients are
applied along those directions, as seen in the top row.

Based on this, the direction along which the diffusion coefficient is
greatest can be claimed to give the fiber direction. This is the main hypothesis
behind fiber-tract mapping using diffusion-weighted MRI. Starting from a seed
point selected within the tissue, repeated stepping in the direction along which
diffusion is fastest will allow the mapping of fibers passing through that point
[Conturo et al., 1999], [Mori et al., 1999], [Basser et al., 2000], [Ozarslan et al.,
2001]. If the main hypothesis of fiber-tract mapping is correct, then this idea
can be used to map anatomically, hence functionally, connected regions of the
brain and spinal cord.




Figure 8.4. Images from an excised spinal cord when the diffusion gradients are applied with
increasing strength (from left to right). Top row includes images when the gradients are along
the white-matter fibers, whereas the bottom row includes those when the gradients are perpendicular
to white-matter fibers.

2.1. Diffusion Tensor Imaging and the Fiber Direction

The most significant quantity needed for fiber-tract mapping is the direction
along which the fibers are oriented. This direction is expected to be that
along which the diffusion constant is greatest. For a general experimentally
obtained angular distribution of diffusivities $D(\theta, \phi)$, it can be argued that this
direction can be taken as that corresponding to the maximum value of $D(\theta, \phi)$.
However, in an experiment that produces discrete samples of this distribution,
if only one of the measurements yields an artificially high diffusion constant,
then the assumed direction may be wrong. The weakness of taking the direction
corresponding to the maximum value of $D(\theta, \phi)$ as the fiber direction is due to
its sensitivity to noise. A more appropriate approach would be to fit all samples
of the data to a model that accounts for anisotropic diffusion.

Diffusion tensor magnetic resonance imaging (DT-MRI), introduced by Basser
et al. [Basser et al., 1994], employs the anisotropic diffusion model previously
proposed by Stejskal [Stejskal, 1965]. In this approach, the diffusion term in
Eq. 3 is replaced by $\nabla \cdot (\mathbf{D} \nabla M_+)$, where $\mathbf{D}$ is a second rank (order), real
valued, positive definite, symmetric tensor called the diffusion tensor. This
change yields the appropriate equation for environments with organizational
anisotropy such as liquid crystals [Callaghan, 1991].

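Physically, the diffusion coefficient is related to the variance of the displacements
in an isotropic environment. This is because, in the unrestricted and
isotropic case, solving a diffusion equation or using a random walk approach leads
to a molecular displacement profile given by

$$ P(\mathbf{x}_0 | \mathbf{x}, t) = \frac{1}{(4\pi D t)^{3/2}} \exp\left( -\frac{|\mathbf{x} - \mathbf{x}_0|^2}{4 D t} \right) . \tag{7} $$

Here, $P(\mathbf{x}_0 | \mathbf{x}, t)$ is the probability that a particle initially at the position $\mathbf{x}_0$
will end up at the position $\mathbf{x}$ at time $t$. Upon the replacement of the diffusion
term with the anisotropic diffusion term, as described above, this displacement
probability profile becomes

$$ P(\mathbf{x}_0 | \mathbf{x}, t) = \frac{1}{\left( (4\pi t)^3 \det(\mathbf{D}) \right)^{1/2}} \exp\left( -\frac{(\mathbf{x} - \mathbf{x}_0)^T \mathbf{D}^{-1} (\mathbf{x} - \mathbf{x}_0)}{4t} \right) . \tag{8} $$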



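The components of the diffusion tensor in this equation form a matrix
proportional to the second moments of the displacements. For a more detailed
explanation of this, see the appendix.

The full solution to the Bloch equation with an anisotropic diffusion term during
the PGSE experiment for transverse magnetization is given by

$$ \begin{aligned} \frac{M_+(t)}{M_+(0)} = \exp\Big( & -i\omega_0 t - \frac{t}{T_2} - i\gamma \, \mathbf{r} \cdot \left[ \mathbf{F}(t) - 2\Theta(t - \tfrac{TE}{2}) \, \mathbf{F}(\tfrac{TE}{2}) \right] \\ & - \gamma^2 \Big[ \int_0^t \mathbf{F}^T \mathbf{D} \mathbf{F} \, dt' - 4\Theta(t - \tfrac{TE}{2}) \, \mathbf{F}(\tfrac{TE}{2})^T \mathbf{D} \int_{TE/2}^{t} \mathbf{F} \, dt' \\ & \qquad + 4\Theta(t - \tfrac{TE}{2}) \, (t - \tfrac{TE}{2}) \, \mathbf{F}(\tfrac{TE}{2})^T \mathbf{D} \, \mathbf{F}(\tfrac{TE}{2}) \Big] \Big) , \end{aligned} \tag{9} $$

where $\Theta$ is the Heaviside step function, and here $\mathbf{F}(t)$ is defined to be

$$ \mathbf{F}(t) := \int_0^t \mathbf{G}(t') \, dt' . \tag{10} $$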
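Note that the last term in Eq. 9 quantifies the diffusive attenuation and will
be our focus from here on. With the definitions $S = |M_+(TE)|$ and $S_0 =
|M_+(0)|$ we can easily see that

$$ \frac{S}{S_0} = \exp\left( -\gamma^2 \int_0^{TE} \tilde{\mathbf{F}}(t)^T \, \mathbf{D} \, \tilde{\mathbf{F}}(t) \, dt \right) , \tag{11} $$

where

$$ \tilde{\mathbf{F}}(t) := \mathbf{F}(t) - 2\Theta(t - TE/2) \, \mathbf{F}(TE/2) . \tag{12} $$

In DT-MRI a b-matrix, defined by [Basser et al., 1994]

$$ \mathbf{b} := \gamma^2 \int_0^{TE} \tilde{\mathbf{F}}(t) \, \tilde{\mathbf{F}}(t)^T \, dt , \tag{13} $$

is used to calculate an effective diffusion tensor through the equation

$$ \frac{S}{S_0} = e^{-\mathrm{trace}(\mathbf{b} \mathbf{D}^{\mathrm{eff}})} , \tag{14} $$

where it is assumed that the diffusion tensor is almost constant in time so that
$\mathbf{D}^{\mathrm{eff}} \approx \mathbf{D}$.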

This equation has seven unknowns, where one of them is the signal intensity
if there were no diffusion ($S_0$) and the remaining six are the unique elements
of the diffusion tensor. Therefore, seven experiments producing seven linearly
independent equations are sufficient to determine the diffusion tensor. Note
that Eq. 14 is the counterpart of Eq. 6, which enables the
calculation of diffusion coefficients. Figure 8.5 shows the components of the
diffusion tensor of an excised rat brain.

Figure 8.5. Diffusion tensor calculated from a series of diffusion weighted images. The order
of the images is $D_{xx}$, $D_{xy}$, $D_{xz}$, $D_{yx}$, $D_{yy}$, $D_{yz}$, $D_{zx}$, $D_{zy}$, $D_{zz}$ from left to right and top to
bottom. The absolute values of the off-diagonal elements are taken since they can be negative
valued.
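The counting argument above (seven unknowns, at least seven independent measurements) can be made concrete with a small least-squares sketch. Taking the logarithm of the signal equation gives a system that is linear in ln S0 and the six unique tensor elements. The sketch assumes that imaging gradients are negligible, so that the b-matrix for a gradient along the unit vector g is b g g^T; the gradient directions, b-values, test tensor and helper names are illustrative assumptions rather than the authors' processing code, and the data are synthetic and noise free.

```python
import math

# Units: b in ms/um^2 and D in um^2/ms, so that products b*D are of order one.
S2, S3 = 1 / math.sqrt(2), 1 / math.sqrt(3)
DIRECTIONS = [(1, 0, 0), (0, 1, 0), (0, 0, 1),
              (S2, S2, 0), (S2, 0, S2), (0, S2, S2), (S3, S3, S3)]
B_VALUES = [0.25, 0.5, 1.0]


def adc(d, g):
    """Quadratic form g^T D g."""
    return sum(g[i] * d[i][j] * g[j] for i in range(3) for j in range(3))


def design_row(b, g):
    """Coefficients of (ln S0, Dxx, Dyy, Dzz, Dxy, Dxz, Dyz) in ln S."""
    gx, gy, gz = g
    return [1.0, -b * gx * gx, -b * gy * gy, -b * gz * gz,
            -2 * b * gx * gy, -2 * b * gx * gz, -2 * b * gy * gz]


def solve_linear(a, y):
    """Solve a x = y by Gauss-Jordan elimination with partial pivoting."""
    n = len(y)
    m = [row[:] + [rhs] for row, rhs in zip(a, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_tensor(measurements):
    """measurements: list of (b, g, signal). Returns (S0, 3x3 tensor)."""
    rows = [design_row(b, g) for b, g, _ in measurements]
    logs = [math.log(s) for _, _, s in measurements]
    n = 7
    # Normal equations of the linear least-squares problem.
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    rhs = [sum(r[i] * y for r, y in zip(rows, logs)) for i in range(n)]
    ln_s0, dxx, dyy, dzz, dxy, dxz, dyz = solve_linear(normal, rhs)
    return math.exp(ln_s0), [[dxx, dxy, dxz], [dxy, dyy, dyz], [dxz, dyz, dzz]]


TRUE_S0 = 100.0
TRUE_D = [[1.7, 0.2, 0.0], [0.2, 0.4, 0.1], [0.0, 0.1, 0.3]]
data = [(b, g, TRUE_S0 * math.exp(-b * adc(TRUE_D, g)))
        for g in DIRECTIONS for b in B_VALUES]
s0_fit, d_fit = fit_tensor(data)
print(s0_fit, d_fit)
```

With noisy data the same system is usually overdetermined on purpose, and a library least-squares routine would normally replace the hand-written solver used here to keep the sketch dependency free.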



So we have two experimental schemes. In one of them, we can apply diffusion
gradients along a certain direction $\mathbf{g}$ with differing gradient strengths to
calculate the apparent diffusion coefficient $D(\mathbf{g})$ from Eq. 6. Here $\mathbf{g}$ is a unit
vector representing the direction of the diffusion gradient, i.e., $\mathbf{G} = G\mathbf{g}$. In the
second scheme, we can apply the diffusion gradient in at least six noncoplanar
directions and with differing gradient strengths and calculate the apparent diffusion
tensor ($\mathbf{D}$) by using Eq. 14. An important question to be addressed is
how these results are related. If the background (imaging) gradients are small
compared with the diffusion gradients, which is almost always the case, then
comparison of Eq. 6 with Eq. 14 yields the simple relation [Hsu and Mori,
1995]

$$ D(\mathbf{g}) = \mathbf{g}^T \mathbf{D} \, \mathbf{g} . \tag{15} $$

Figure 8.6. Eigenvalues of the diffusion tensor in decreasing order.



In addition to its great reduction in the number of experiments required to
quantify the three dimensional distribution of diffusivities, DT-MRI also makes it
possible to quantify the direction along which the diffusion coefficient is greatest.
This is done by finding the $\mathbf{g}$ that maximizes $D(\mathbf{g})$. Once the diffusion tensor is
calculated, this direction is just the eigenvector corresponding to the largest of
the eigenvalues of $\mathbf{D}$.
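A short sketch can show that the direction maximizing the quadratic form g^T D g coincides with the principal eigenvector. The tensor below is a hypothetical example built with its largest diffusivity along (1, 1, 0), and power iteration is used only to keep the sketch free of external libraries; it is a stand-in for whatever eigen-solver is used in practice, and it assumes the starting vector is not orthogonal to the principal eigenvector.

```python
import math
import random


def quad_form(d, g):
    """Apparent diffusivity along the unit vector g: g^T D g."""
    return sum(g[i] * d[i][j] * g[j] for i in range(3) for j in range(3))


def principal_direction(d, iterations=200):
    """Power iteration for the eigenvector of the largest eigenvalue of a
    symmetric positive definite 3x3 tensor."""
    v = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        w = [sum(d[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def random_unit_vector():
    while True:
        v = [random.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


# Hypothetical tensor: eigenvalue 1.5 along (1, 1, 0)/sqrt(2), 0.3 across it.
D = [[0.9, 0.6, 0.0],
     [0.6, 0.9, 0.0],
     [0.0, 0.0, 0.3]]

e1 = principal_direction(D)
random.seed(1)
sampled_max = max(quad_form(D, random_unit_vector()) for _ in range(2000))

print("principal direction:", e1)
print("D along it:", quad_form(D, e1), " largest sampled D(g):", sampled_max)
```

No sampled direction exceeds the diffusivity along the principal eigenvector, whereas picking the maximum over a few noisy measured directions would be far less robust, as argued in the text.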

3. Scalar Measures Derived From DT-MRI 

Similar to what was presented for the case of diffusion weighted imaging,
fitting the data to the Stejskal-Tanner relation for anisotropic diffusion, Eq. 14,
produces an almost nondiffusion weighted image, which is calculated from the
intercept of the fit. The intercept does not have any diffusion weighting due to
the gradients used in the pulse sequence since it corresponds to the signal at
$b = 0$. Therefore, the only diffusion weighting comes from the magnetic field
inhomogeneities within the sample. This reduction in the diffusion weighting
in the MR images can only be obtained by the quantification of diffusion.

A diffusion tensor has 9 components. However, since it is a symmetric tensor,
only 6 of these have distinct values. Also, because it is symmetric, the diffusion
tensor is diagonalizable. Its spectral decomposition is given by

$$ \mathbf{D} = \lambda_1 \mathbf{u}_1 \mathbf{u}_1^T + \lambda_2 \mathbf{u}_2 \mathbf{u}_2^T + \lambda_3 \mathbf{u}_3 \mathbf{u}_3^T , \tag{16} $$

where $\lambda_i$ are the eigenvalues and $\mathbf{u}_i$ are the eigenvectors. Images of the three
eigenvalues of the diffusion tensor calculated on an excised rat brain are given
in Figure 8.6. Upon rotations, the eigenvalues will remain constant whereas
the components of the eigenvectors will change. Therefore, a (rotationally)
invariant basis of the diffusion tensor has three elements. These three elements
can be chosen to be the three eigenvalues. However, it is not easy to interpret
the meaning of these values, except perhaps the principal (greatest) eigenvalue,
which is just the diffusivity along the fiber direction. Another choice could
be the first three moments of the diffusion tensor, namely trace($\mathbf{D}$), trace($\mathbf{D}^2$)
and trace($\mathbf{D}^3$). All of these indices, like the eigenvalues, have units. The
first moment of the diffusion tensor, giving the average of diffusivities along
all directions, has been chosen to be the first scalar measure derived from the
diffusion tensor. This quantity, called mean diffusivity [Basser, 1995], is given
by

$$ \langle D \rangle = \frac{\mathrm{trace}(\mathbf{D})}{3} = \frac{\lambda_1 + \lambda_2 + \lambda_3}{3} . \tag{17} $$

Figure 8.7. Nondiffusion weighted image (left), mean diffusivity (middle), and fractional
anisotropy (right) images from the excised rat brain.



Traditionally, it has been found useful to have unitless scalars instead of
the second and third moments. In particular, an index that measures the level
of organization in a pixel would be very useful since it would make it easy
to distinguish highly organized regions, like white-matter, from others. There
have been several suggestions for the parametrization of anisotropy in the tissue.
The most commonly used one is Fractional Anisotropy [Basser, 1995], which
is expressed in terms of the eigenvalues $\lambda_i$ and the mean diffusivity $\langle D \rangle$ as

$$ FA = \sqrt{\frac{3}{2}} \, \sqrt{\frac{(\lambda_1 - \langle D \rangle)^2 + (\lambda_2 - \langle D \rangle)^2 + (\lambda_3 - \langle D \rangle)^2}{\lambda_1^2 + \lambda_2^2 + \lambda_3^2}} . \tag{18} $$

FA takes the value of 0 for totally isotropic tensors whereas it takes the value of
1 in the completely anisotropic case. As for the third invariant, a scalar measure
such as skewness [Bahn, 1999] can be used. However, in neural tissue the third
invariants have not found widespread utilization. In Figure 8.7 we show the
nondiffusion weighted image of the rat brain calculated from the intercept of
the fit along with mean diffusivity and fractional anisotropy maps.

3.1. Anisotropy in terms of Entropy 

An important matter in fiber-tract mapping is choosing an appropriate criterion
to terminate the tract tracing process. Traditionally, this criterion has
been the fall of anisotropy below a prespecified value. The idea behind this
is that in low anisotropy regions, the fiber direction becomes too uncertain to
continue tracking. This idea in effect states what is expected of an anisotropy
measure. Namely, anisotropy should quantify the level of orientational certainty
contained in the diffusion data. Or equivalently, isotropy can be defined
as the level of uncertainty. With this in mind, we take the orientation specified
by the solid angle (corresponding to the polar angle $\theta$ and azimuthal angle $\phi$)
as a random variable and define the isotropy in terms of the differential entropy
given by

$$ \sigma := -\frac{3}{2\pi} \int_0^{2\pi} d\phi \int_0^{\pi/2} d\theta \, \sin\theta \, D_N(\theta, \phi) \ln D_N(\theta, \phi) , \tag{19} $$

where $D_N$ is the ‘normalized’ diffusivity given by

$$ D_N(\theta, \phi) := D(\theta, \phi) \Big/ \left( \frac{3}{2\pi} \int_0^{2\pi} d\phi \int_0^{\pi/2} d\theta \, \sin\theta \, D(\theta, \phi) \right) . \tag{20} $$

Integrations are carried out over the hemisphere because diffusivities have antipodal
symmetry; therefore directions correspond to rays rather than vectors.
Eq. 19 is the same expression as the Boltzmann-Gibbs entropy, which quantifies
the uncertainty in a phase space, taken here to be the projective 2-sphere,
and $D_N(\theta, \phi)$ is the probability function.

3.2. Anisotropy of a Diffusion Tensor 

Having determined a measure of certainty in a given diffusivity profile, we
now find a corresponding relation to Eq. 19 from the diffusion tensor. This
can be done by discretizing $D(\mathbf{g})$ in Eq. 15 and using discretized versions of
Eqs. 19 and 20. However, this would require too many computations, and the
differential entropy has the undesirable feature that the entropy calculated from
samples of a continuous distribution does not converge asymptotically to the
entropy calculated from the continuous distribution. Therefore, we look for an
expression that is calculated directly from the tensor. To do this, we realize that
the trace of a 3 x 3 matrix $\mathbf{A}$ is given by

$$ \mathrm{trace}(\mathbf{A}) = \frac{3}{2\pi} \int_0^{2\pi} d\phi \int_0^{\pi/2} d\theta \, \sin\theta \, \psi_{\theta,\phi}^T \, \mathbf{A} \, \psi_{\theta,\phi} , \tag{21} $$

where $\psi_{\theta,\phi} = (\sin\theta \cos\phi, \sin\theta \sin\phi, \cos\theta)^T$. We note that this is the same
expression as the denominator in Eq. 20 when $D(\theta, \phi)$ is replaced by the quadratic
form in Eq. 15. So, Eq. 20 turns into

$$ D_N(\theta, \phi) = \psi_{\theta,\phi}^T \, \rho \, \psi_{\theta,\phi} , \tag{22} $$

where

$$ \rho := \frac{\mathbf{D}}{\mathrm{trace}(\mathbf{D})} . \tag{23} $$

Note that $\rho$ is a symmetric positive definite matrix with trace($\rho$) = 1. Inserting
Eq. 22 into Eq. 19, we get



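$$ \begin{aligned} \sigma &= -\frac{3}{2\pi} \int_0^{2\pi} d\phi \int_0^{\pi/2} d\theta \, \sin\theta \, \left( \psi_{\theta,\phi}^T \rho \, \psi_{\theta,\phi} \right) \ln \left( \psi_{\theta,\phi}^T \rho \, \psi_{\theta,\phi} \right) \\ &\approx -\frac{3}{2\pi} \int_0^{2\pi} d\phi \int_0^{\pi/2} d\theta \, \sin\theta \, \psi_{\theta,\phi}^T \, \rho \ln \rho \; \psi_{\theta,\phi} , \end{aligned} \tag{24} $$

where the approximation used is the same as the classical approximation to
quantum entropy [Wehrl, 1978].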

Using Eq. 21 once more, we get the simple expression for the entropy

$$ \sigma(\rho) := -\mathrm{trace}(\rho \ln \rho) = -\sum_{i=1}^{3} \rho_i \ln \rho_i \approx \sigma , \tag{25} $$

where $\rho_i$ are the eigenvalues of $\rho$. Note that this is the same expression as the von
Neumann entropy in quantum statistical mechanics (and quantum information
theory), quantifying the uncertainty level (or the amount of lack of information)
contained in a density matrix [Fick and Sauermann, 1990].
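The relation between the hemisphere integral and the tensor-based entropy can be checked numerically. The sketch below evaluates the scaled hemisphere integral with a midpoint rule, uses it to confirm the trace identity for a normalized tensor, and computes both the differential entropy of the normalized diffusivity profile and the von Neumann entropy. The eigenvalues of the normalized tensor, the grid sizes and the function names are assumptions made for illustration, and the profile is written in the eigenframe of the tensor for brevity.

```python
import math


def scaled_hemisphere_integral(f, n_theta=200, n_phi=400):
    """(3 / 2 pi) times the integral of f(theta, phi) sin(theta) over the
    upper hemisphere, evaluated with a midpoint rule."""
    d_theta = (math.pi / 2) / n_theta
    d_phi = (2 * math.pi) / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        weight = math.sin(theta)
        for j in range(n_phi):
            total += f(theta, (j + 0.5) * d_phi) * weight
    return 3.0 / (2.0 * math.pi) * total * d_theta * d_phi


def normalized_profile(rho_eigs):
    """D_N(theta, phi) = psi^T rho psi in the eigenframe of rho."""
    r1, r2, r3 = rho_eigs

    def d_n(theta, phi):
        s, c = math.sin(theta), math.cos(theta)
        return (r1 * (s * math.cos(phi)) ** 2
                + r2 * (s * math.sin(phi)) ** 2
                + r3 * c * c)

    return d_n


def differential_entropy(rho_eigs):
    """Entropy of the normalized diffusivity profile over the hemisphere."""
    d_n = normalized_profile(rho_eigs)
    return -scaled_hemisphere_integral(
        lambda theta, phi: d_n(theta, phi) * math.log(d_n(theta, phi)))


def von_neumann_entropy(rho_eigs):
    """-trace(rho ln rho) expressed through the eigenvalues of rho."""
    return -sum(r * math.log(r) for r in rho_eigs)


rho = (0.6, 0.3, 0.1)   # hypothetical eigenvalues of rho, summing to one
trace_check = scaled_hemisphere_integral(normalized_profile(rho))
sigma_profile = differential_entropy(rho)
sigma_tensor = von_neumann_entropy(rho)

print("trace recovered from the hemisphere integral:", trace_check)
print("entropy of the profile:", sigma_profile)
print("von Neumann entropy:   ", sigma_tensor)
```

The two entropies agree exactly for an isotropic tensor, where both equal ln 3, and need not coincide otherwise, since the tensor expression rests on an approximation.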

Since isotropy is a measure of uncertainty, we define anisotropy to be a
function monotonically decreasing with increasing entropy. Two such indices
that we propose are: quantitative anisotropy, which quantifies the orientational
information content of a diffusion tensor, given by [Ozarslan and Mareci, 2003a]

$$ QA = \ln 3 - \sigma , \tag{26} $$

and visual anisotropy, which gives better contrast in calculated images, given by



VA:=J2 + ^-^— . (27) 

V 21n3 ^ 

The formulations of the anisotropy indices QA and VA as in Eqs. 26 and
27 enable one to visualize the behavior of these indices for given values of
the eigenvalues of $\rho$. If we order the eigenvalues such that $\rho_1 > \rho_2 > \rho_3$, then
this condition, along with $\rho_1 + \rho_2 + \rho_3 = 1$ and the positivity of the
eigenvalues, limits the allowed values of $\rho_1$ and $\rho_2$ to the interior of a triangular
zone in the $\rho_1 \rho_2$ plane given by:

$$ \begin{aligned} \rho_1 - \rho_2 &> 0 \\ \rho_1 + 2\rho_2 &> 1 \\ \rho_1 + \rho_2 &< 1 . \end{aligned} \tag{28} $$

Figure 8.8. The behavior of entropy and anisotropy indices for all possible eigenvalues of the
normalized diffusion tensor.



Since $\rho_3$ is uniquely defined by a given set of $\rho_1$ and $\rho_2$, these anisotropy
measures can be visualized within this triangle. Figure 8.8 shows these maps
for these two indices along with the von Neumann entropy and its 8th power.

Figure 8.9 shows the calculated maps of these indices for a coronal rat
brain slice. As expected, regions corresponding to white-matter are bright
in anisotropy maps and dark in entropy maps, making them a suitable choice
as a thresholding parameter in fiber-tracking.

Figure 8.9. Images formed by calculating the scalar indices introduced in the text for a transverse
excised rat brain slice. Top left is the entropy, bottom left is its 8th power, top right is QA,
and bottom right is VA.

Figure 8.10. White matter fiber tracts of normal (left) and injured (right) rat spinal cords.

The inflation in the number of anisotropy indices that have been proposed
in the literature is a serious problem because given two voxels, one index may
identify the first one as more anisotropic, whereas another may identify it as
more isotropic. This makes the comparative analyses of the anisotropy values
dependent on the choice of the anisotropy index. The anisotropy indices proposed
previously lack the meaning that we have given to ours, namely, a quantity
increasing with increasing information content. Moreover, the FA index that we
have mentioned above can be expressed in terms of the normalized diffusion
tensor $\rho$ we defined:

$$ FA = \sqrt{\frac{3}{2} \left( 1 - \frac{1}{3 \, \mathrm{trace}(\rho^2)} \right)} . \tag{29} $$

So, just like VA is a scaled version of trace($\rho \ln \rho$), FA is a scaled version of
trace($\rho^2$). This is just the purity index in the information theory and statistical
mechanics literature and is known to be a good measure to decide whether or
not the distribution is pure, while it violates the properties that a valid measure
of information should possess [Wehrl, 1978], [Fick and Sauermann, 1990].
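The point that different indices can rank the same pair of voxels differently can be illustrated with a short sketch. It computes QA from the entropy of the normalized eigenvalues and FA in two equivalent ways, from the eigenvalues directly and from the purity trace(rho^2). The two sets of eigenvalues are hypothetical and were chosen to exhibit the disagreement; VA is left out of the sketch.

```python
import math


def normalize(eigs):
    """Eigenvalues of rho = D / trace(D)."""
    total = sum(eigs)
    return [lam / total for lam in eigs]


def entropy(rho_eigs):
    """von Neumann entropy -sum(rho_i ln rho_i)."""
    return -sum(r * math.log(r) for r in rho_eigs)


def qa(eigs):
    """Quantitative anisotropy: ln 3 minus the entropy."""
    return math.log(3.0) - entropy(normalize(eigs))


def fa_from_eigenvalues(eigs):
    """Fractional anisotropy from the spread of the eigenvalues."""
    mean = sum(eigs) / 3.0
    num = sum((lam - mean) ** 2 for lam in eigs)
    den = sum(lam ** 2 for lam in eigs)
    return math.sqrt(1.5 * num / den)


def fa_from_purity(eigs):
    """Fractional anisotropy written through the purity trace(rho^2)."""
    purity = sum(r ** 2 for r in normalize(eigs))
    return math.sqrt(max(0.0, 1.5 * (1.0 - 1.0 / (3.0 * purity))))


voxel_a = (0.49, 0.49, 0.02)   # hypothetical, close to planar
voxel_b = (0.70, 0.15, 0.15)   # hypothetical, close to prolate

for name, eigs in (("A", voxel_a), ("B", voxel_b)):
    print(name, "FA =", round(fa_from_eigenvalues(eigs), 3),
          "QA =", round(qa(eigs), 3))
```

For this pair FA is larger for voxel B while QA is larger for voxel A, so the answer to which voxel is more anisotropic depends on the index, as the text argues.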

4. Fiber-Tract Mapping in Neural Tissue

Once the diffusion tensor is calculated, the scalar index that will be used as the
termination criterion, as well as the eigenvalues and eigenvectors, are calculated.
The fiber tracts are constructed by repeatedly stepping in the direction along
which the diffusion coefficient is greatest, which is given by the eigenvector
corresponding to the largest of the eigenvalues. The constructed fiber-tracts
can be thought of as curves whose tangent vector is set equal to this direction.
The equation of motion in this case is given by a Frenet formula [Basser et al.,
2000]:

$$ \frac{d\mathbf{r}(s)}{ds} = \hat{\mathbf{e}}(\mathbf{r}(s)) , \tag{30} $$

where $\mathbf{r}$ is the position vector, $ds$ is the infinitesimal scalar distance on the curve,
and $\hat{\mathbf{e}}$ is the unit vector obtained from the diffusion data. This equation can be
solved by integration with a user supplied initial condition.
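A minimal sketch of such an integration is given below, using fixed-step Euler updates. It is not the authors' tracking code: the direction field (fibers circling the z axis), the anisotropy map, the step size and the threshold are all hypothetical, chosen so that the behavior is easy to verify. Because an eigenvector is defined only up to sign, the sketch flips each new direction when it points against the previous step.

```python
import math


def direction_at(r):
    """Hypothetical principal eigenvector field: fibers circling the z axis.
    The sign is flipped for x < 0 to mimic the sign ambiguity of eigenvectors."""
    x, y, _ = r
    norm = math.hypot(x, y)
    e = (-y / norm, x / norm, 0.0)
    return e if x >= 0 else tuple(-c for c in e)


def anisotropy_at(r):
    """Hypothetical anisotropy map: high for x > -0.5, low elsewhere."""
    return 0.8 if r[0] > -0.5 else 0.1


def track(seed, step=0.01, threshold=0.2, max_steps=10_000):
    """Euler integration of dr/ds = e(r), stopped by an anisotropy threshold."""
    path = [seed]
    r = seed
    previous = None
    for _ in range(max_steps):
        if anisotropy_at(r) < threshold:
            break                       # termination criterion
        e = direction_at(r)
        if previous is not None and sum(a * b for a, b in zip(e, previous)) < 0:
            e = tuple(-c for c in e)    # keep a consistent sense of travel
        r = tuple(ri + step * ei for ri, ei in zip(r, e))
        path.append(r)
        previous = e
    return path


path = track((1.0, 0.0, 0.0))
radii = [math.hypot(x, y) for x, y, _ in path]
print("points on the tract:", len(path))
print("end point:", path[-1])
print("radius range:", min(radii), max(radii))
```

The slow outward drift of the radius is the discretization error of the Euler step; a smaller step or a higher order integrator reduces it. Tracking in the opposite sense from the same seed would start with the negated direction.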

Excised spinal cords of one normal and one injured female Sprague-Dawley
rat were imaged with a diffusion weighted spin-echo pulse sequence at 17.6 T
using a Bruker Avance imaging and spectrometer system. Diffusion gradients
were applied along seven directions: $\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$, $\frac{1}{\sqrt{2}}(\mathbf{e}_x + \mathbf{e}_y)$, $\frac{1}{\sqrt{2}}(\mathbf{e}_x + \mathbf{e}_z)$,
$\frac{1}{\sqrt{2}}(\mathbf{e}_y + \mathbf{e}_z)$, $\frac{1}{\sqrt{3}}(\mathbf{e}_x + \mathbf{e}_y + \mathbf{e}_z)$ with four different strengths: 0, 180, 360
and 540 mT/m. Imaging parameters were: TR = 1000 ms, TE = 27.1 ms,
$\Delta$ = 17.8 ms, $\delta$ = 2.4 ms, resolution = 40 x 40 x 250 μm³. Fibers were
calculated from selected regions of interest (ROIs) which specify the initial
values of Eq. 30. Tracking is terminated when anisotropy drops below a
certain prespecified value. Figure 8.10 shows an example of calculated fibers
from normal and injured spinal cords when the ROIs are selected in the dorsal
column. The injury site suffers a discontinuity in the fibers.

Also imaged was an excised rat brain with b-values of approximately 100,
500 and 1000 s/mm². Imaging parameters were: TR = 3060 ms, TE = 28.8 ms,
$\Delta$ = 17.8 ms, $\delta$ = 2.4 ms, resolution = 117 x 117 x 270 μm³, and matrix size
was 128 x 128 x 78. The seed points were selected in the corpus callosum and
cerebral peduncles and fibers were tracked in orthograde as well as retrograde
directions. These images are shown in Figure 8.11.

5. Problems of DT-MRI Based Fiber Tracking

Despite the fact that only limited quantitative verification of the DT-MRI
based fiber-tract mapping results has been done, qualitative comparisons with
known anatomy of the nervous tissue have shown that DT-MRI is able to correctly
map the major axonal pathways [Ozarslan et al., 2001]. Despite the promising
results achieved by DT-MRI, it is known to have significant weaknesses.

One of the problems with DT-MRI is that the formalism presented to calculate
the diffusion tensor from signal attenuation does not account for spatial
dependence of the diffusion tensor within a voxel. It is the correct description
for a medium that is homogeneously anisotropic such as liquid crystals. It is not
known how spatial dependence of the tensor would affect the measured signal
attenuation.

Another problem with the formalism just presented is that it assumes no
boundary conditions, hence assumes free diffusion and does not account for
restricting boundaries. These problems are present in diffusion weighted imaging
as well. Therefore, the word ‘apparent’ has been used before the phrases ‘diffusion
coefficient’ and ‘diffusion tensor’ to emphasize that calculated diffusivities
are in reality dependent on the parameters of the pulse sequence used.

DT-MRI based fiber tracking assumes that there is a single fiber direction 
within a voxel. A typical axon has a diameter on the order of 10 μm, and the typical 
voxel volume of the images we acquire at our institution is on the order of 
(100 μm)³. So, in every voxel, there is a bundle of axons. It is known that in 
many areas of the brain, these axons cross. This may result in deviations in the 
calculated directions and premature termination of tracking due to a decrease 
in the anisotropy value. This effect will be more serious in clinical imaging, 
where the voxel volume will be on the order of 1 mm³. Similar problems may occur 
even when there is a single fiber bundle in the voxel with a curvature on the order 
of 1/d or higher, where d is the length of the edge of a voxel. 

Figure 8.11. White matter fiber tracts in the corpus callosum (left) and cerebral peduncles 
(right) in a rat brain. 

5.1. A Post Processing Algorithm for Diffusion Tensor Images 



The relatively undemanding nature of DT-MRI has led some groups to try to 
overcome these difficulties by designing post processing algorithms that improve 
the fiber tract mapping process. As an example, we present our directional 
coherence formalism [Ozarslan and Mareci, 2002]. 

Diffusion-weighted imaging based fiber-tract mapping has been thought of 
as a technique to be applied to white-matter because of the high certainty in 
the fiber direction. However, we have managed to map structure in spinal cord 
gray-matter by two means. First, we define a scalar index called directional 
coherence (DC) that quantifies how consistent the DT-MRI based fiber direction 
in a voxel is with that of its neighboring voxels. Second, we smooth the diffusion 
tensor image in such a way that the smoothed tensor field has higher 
orientational certainty. Diagonalization of the new tensor yields more robust 
tracking, making it possible to see the structure in gray-matter. 

Calculation of DC proceeds as follows. First, the diffusion tensors are 
diagonalized and the primary eigendirection is called u. Then a matrix T_0 is 
defined to be the outer product of this vector with itself, i.e. T_0 := u u^T. Next, 
this tensor image is convolved with a Gaussian kernel of standard deviation s given by 



Obviously, if the voxel is in a perfectly coherent environment, the convolution 
step will not change the matrix, i.e. T_s will remain equal to T_0, which has the 
highest possible value of 1 for VA. Therefore, starting with a unidirectional 
matrix, Gaussian convolution ‘contaminates’ its purity with its environment, and 
the reduced anisotropy quantifies the level of coherence in the local 
environment. 
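The DC calculation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it works on a 2-D slice of unit direction vectors, it uses fractional anisotropy as a stand-in for the visual anisotropy (VA) index defined elsewhere in this chapter (both equal 1 for a rank-one tensor u u^T), and the function names are hypothetical.

```python
import numpy as np


def gaussian_kernel_1d(s):
    """Normalized 1-D Gaussian kernel with standard deviation s (in voxels)."""
    radius = max(1, int(np.ceil(3 * s)))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2.0 * s ** 2))
    return k / k.sum()


def smooth_tensor_field(T, s):
    """Gaussian-convolve every tensor component over the two spatial axes.

    T has shape (nx, ny, 3, 3); nx and ny must exceed the kernel length.
    """
    k = gaussian_kernel_1d(s)
    out = np.asarray(T, dtype=float)
    for axis in (0, 1):
        out = np.apply_along_axis(np.convolve, axis, out, k, mode="same")
    return out


def fractional_anisotropy(T):
    """Anisotropy index in [0, 1]; assumed stand-in for VA (not the VA formula)."""
    lam = np.linalg.eigvalsh(T)
    mean = lam.mean(axis=-1, keepdims=True)
    num = np.sqrt(((lam - mean) ** 2).sum(axis=-1))
    den = np.sqrt((lam ** 2).sum(axis=-1))
    return np.sqrt(1.5) * num / np.maximum(den, 1e-12)


def directional_coherence(u, s=1.0):
    """u: (nx, ny, 3) unit primary eigenvectors. Returns an (nx, ny) DC map."""
    T0 = u[..., :, None] * u[..., None, :]   # outer product u u^T per voxel
    Ts = smooth_tensor_field(T0, s)           # Gaussian-convolved tensor image
    return fractional_anisotropy(Ts)


# Perfectly coherent field: every voxel points along x.
coherent = np.zeros((11, 11, 3))
coherent[..., 0] = 1.0

# Incoherent field: random unit vectors.
rng = np.random.default_rng(0)
random_dirs = rng.normal(size=(11, 11, 3))
random_dirs /= np.linalg.norm(random_dirs, axis=-1, keepdims=True)

dc_coherent = directional_coherence(coherent, s=1.0)
dc_random = directional_coherence(random_dirs, s=1.0)
```

In the coherent field the smoothed tensor stays rank-one, so the index remains at 1, while in the random field the neighbouring directions "contaminate" each voxel and the index drops.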

Next, we smooth the diffusion tensor field D based on the idea that given two 
voxels, the ‘interacting’ quantity between the diffusion tensors from each voxel 
should be their quadratic form along the direction connecting the two voxels. 



Images 




(31) 



(32) 



(33) 
Figure 8.12. On the left is the visual anisotropy of the diffusion tensor, VA(D). Next to it is 
VA(T_DC), which is a measure of how coherent the fibers in a particular voxel are with those of its 
neighbours. Finally, on the right is the angle by which the principal eigenvector of the diffusion 
tensor is changed as a result of the smoothing algorithm. 



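A simple implementation of this idea is to define a connectivity vector by 
the equation 

$$ c_{ijk} = \frac{1}{2} \begin{pmatrix} \pm\, x^T (D_{i+1,jk} + D_{i-1,jk})\, x \\ \pm\, y^T (D_{i,j+1,k} + D_{i,j-1,k})\, y \\ \pm\, z^T (D_{ij,k+1} + D_{ij,k-1})\, z \end{pmatrix} \qquad (34) $$

where x, y and z are the unit vectors along the coordinate axes, and the ± signs 
in front of each of the components indicate the uncertainty in the 
sign at this point. Next, we calculate the outer product of this vector with itself, 
do a Gaussian convolution, and define the directional coherence tensor as 

$$ T_{DC} = G_s * \left( c'_{ijk} \, c'^{\,T}_{ijk} \right) \qquad (35) $$

where G_s is the Gaussian kernel and c'_{ijk} is the same as c_{ijk} with the 
combination of signs that maximizes trace(T_DC D). Note that VA(T_DC) is the 
directional coherence of the new tensor field. 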

Figure 8.12 shows the scalar indices that are produced by the algorithm. 
The image on the left is the visual anisotropy (VA) image of a slice from the 
excised rat spinal cord. The image in the middle is the VA of the directional 
coherence tensor from this slice, which is the coherence in the smoothed tensor 
field. As is obvious from the comparison, unlike the VA of the diffusion 
tensor, this image contains high contrast within the gray-matter. Although the 
diffusion tensors in these bright pixels in the gray-matter are quite isotropic, they 
are directionally coherent. The image on the right depicts the angle by which the 
presumed fiber directions are changed as a result of the algorithm. It is clear that 
the algorithm does not affect the highly anisotropic and highly coherent areas 
in white-matter, whereas there are changes in gray-matter. In nearly isotropic 
areas, the directions, when chosen to be the eigenvector corresponding to the 
largest of the eigenvalues, can be wrong by as much as 90° due to 
the bias in the sorting of the eigenvalues. The very bright pixels in gray-matter 
correspond to the correction of this kind of error. 

Figure 8.13. These images show the change in the calculated fiber-tracts. The first 
shows the erratic trajectories in gray-matter before smoothing, and the last two show the 
calculated fiber-tracts from the smoothed tensor image. 

The fiber-tracts calculated for the excised rat spinal cord can be seen 
in Figure 8.13. The left figure shows the erratic trajectories when the ROI is 
chosen in gray-matter. The figures in the middle and on the right show the 
calculated fibers after the algorithm is applied. The structure in gray-matter 
is significantly improved. These results are consistent with the directional 
coherence map because the fiber-tract mapping algorithm shows 
coherent fiber bundles in the areas where directional coherence values were 
high, and the dark area in the center corresponds to fiber crossings in the 
calculated fiber-tracts. 

5.2. New Approaches to Diffusion Imaging 

Recently, there have been attempts to replace the tensor model with more 
sophisticated models. In one approach [Wedeen et al., 2000], q-space 
imaging methods (see the appendix for a brief discussion) have been applied to 
calculate displacement profiles of molecules without satisfying the 
narrow pulse approximation assumed in the q-space formalism [Callaghan, 
1991]. This method, called ‘diffusion spectrum imaging’, is unlikely to be used 
clinically because of the very demanding hardware and acquisition times it requires 
[Basser, 2002]. 

Another approach to diffusion imaging is called ‘high angular resolution 
diffusion imaging’ [Tuch et al., 1999]. In this method, the same gradient strength is 
applied along many directions to calculate a diffusivity profile D(g), modeling 
three dimensional diffusion by a series of one dimensional diffusion processes. 
This approach was further improved by taking the spherical harmonic transform 
of this diffusivity profile and terminating the resulting Laplace series 
so that it includes only the significant terms in the expansion [Frank, 2002]. 

We have further improved this approach by writing a corresponding Bloch- 
Torrey equation that includes a Cartesian tensor of rank higher than 2 [Ozarslan 
and Mareci, 2003b]. In this case, the corresponding Stejskal-Tanner relation- 
ship is derived allowing the evaluation of the components of higher rank tensors 
by using least squares fits. This makes the evaluation of the spherical harmonic 
transform unnecessary. The calculation of anisotropy and fiber directions nec- 
essary for fiber-tract mapping remains an open area of research for these new 
techniques. 
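To make the least squares step concrete, the sketch below fits the familiar rank-2 tensor from the Stejskal-Tanner relationship ln(S/S0) = -b g^T D g. It is only the rank-2 analogue of the idea described above, written under our own assumptions: the gradient directions, b-value, and tensor values are synthetic, and the function name is hypothetical. A higher rank tensor would simply contribute more columns (one per independent component) to the design matrix.

```python
import numpy as np


def fit_rank2_tensor(signals, s0, bvals, gdirs):
    """Least squares fit of a symmetric 3x3 tensor D from ln(S/S0) = -b g^T D g."""
    g = np.asarray(gdirs, dtype=float)
    b = np.asarray(bvals, dtype=float)
    # One column per independent component: Dxx, Dyy, Dzz, Dxy, Dxz, Dyz.
    design = b[:, None] * np.column_stack([
        g[:, 0] ** 2, g[:, 1] ** 2, g[:, 2] ** 2,
        2 * g[:, 0] * g[:, 1], 2 * g[:, 0] * g[:, 2], 2 * g[:, 1] * g[:, 2],
    ])
    y = -np.log(np.asarray(signals, dtype=float) / s0)
    d, *_ = np.linalg.lstsq(design, y, rcond=None)
    return np.array([[d[0], d[3], d[4]],
                     [d[3], d[1], d[5]],
                     [d[4], d[5], d[2]]])


# Synthetic, noise-free example (illustrative values, in mm^2/s and s/mm^2).
D_true = np.array([[1.7e-3, 0.2e-3, 0.0],
                   [0.2e-3, 0.4e-3, 0.0],
                   [0.0,    0.0,    0.3e-3]])
r = 1.0 / np.sqrt(2.0)
gdirs = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                  [r, r, 0], [r, 0, r], [0, r, r], [r, -r, 0]])
bvals = np.full(len(gdirs), 1000.0)
s0 = 100.0
signals = s0 * np.exp(-bvals * np.einsum("ni,ij,nj->n", gdirs, D_true, gdirs))

D_fit = fit_rank2_tensor(signals, s0, bvals, gdirs)
```

With at least six non-degenerate directions the system is overdetermined or exactly determined, and the fit recovers the tensor used to generate the signals.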
Appendix: Diffusion Tensor and Displacement Profile 

In this appendix, we try to develop an intuitive understanding of what 
diffusion tensor imaging really measures. To do this, we give a brief review of 
Markoff’s method applied to the problem of random flights and relate it to 
another MR diffusion imaging modality called q-space imaging. 

Assume that we have a single particle undergoing a series of random flights 
[Chandrasekhar, 1943] where at each step k, the particle moves a distance 
r_k = (x_k, y_k, z_k)^T with a probability distribution function τ_k(x_k, y_k, z_k). Then the 
probability of finding the particle at the position R = Σ_{k=1}^{N} r_k relative to the 
initial position is given by 



We will be interested in the behaviour of this general solution under two 
conditions: N → ∞, and the probability function τ_k is identical at each step, i.e., 
τ = τ_k for all k. If we further assume that all first moments of the displacements 
vanish, we get the expression: 




(8.A.1) 



where 




(8.A.2) 




(8.A.3) 
where 




(8.A.4) 



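and components of this matrix are given by 

$$ S_{ij} = \langle r_i r_j \rangle = N \int dr \; r_i r_j \, \tau(r) . \qquad (8.A.5) $$

This yields a probability distribution 

$$ W_N(R) = \left( 8\pi^3 \det(S) \right)^{-1/2} \exp\left( -\tfrac{1}{2} R^T S^{-1} R \right) . \qquad (8.A.6) $$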

Now we look at what happens to the signal when the gradient pulses are 
much shorter than the interval between them, i.e., δ ≪ Δ in a PGSE experiment. 
Under this condition, all the diffusive processes occur in the absence of diffusion 
gradients. Therefore, it is possible to quantify the signal attenuation by the 
relation [Callaghan, 1991] 



where P_s(r|r', Δ) is the probability for a particle initially at r to end up at r' 
after time Δ, and ρ(r) is the probability of finding the particle at r. Assuming 
a spatially homogeneous P_s(r|r', Δ), this equation can be written in terms of 
the dynamic displacement R := r' - r. 



Therefore, the propagator P_s(R, Δ) is just the Fourier transform of the signal 
attenuation. With the definition q := γδg/(2π), utilization of Eq. 8.A.8 to 
calculate P_s(R, Δ) is called q-space imaging. 

Since P_s(R, Δ) is physically the same function as W_N(R) with the 
appropriate change of variables, so should be their Fourier transforms. Therefore, the 
signal attenuation observed in a diffusion tensor experiment when δ ≪ Δ is 
equal to A_N given by Eq. 8.A.3. Therefore, 

$$ -\tfrac{1}{2} \Lambda^T S \Lambda = -\gamma^2 \delta^2 \Delta \; g^T D g . \qquad (8.A.9) $$

Using this along with Λ := γδg yields the simple relation [Ozarslan and Mareci, 
2003c] 

$$ D = \frac{S}{2\Delta} . \qquad (8.A.10) $$
Therefore, the calculated diffusion tensor, if the condition δ ≪ Δ is satisfied, 
is a matrix of second moments of the displacement divided by the diffusion 
time. DT-MRI can thus be thought of as a method that attempts to calculate 
this matrix without necessarily fulfilling the assumptions of q-space imaging. 
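The statement that the second-moment matrix of the net displacement grows as N times the second-moment matrix of a single step (Eq. 8.A.5) can be checked numerically. The sketch below is only an illustration under our own assumptions: the steps are drawn uniformly from an anisotropic box (so they are deliberately non-Gaussian), and the step time `dt`, box half-widths, and variable names are hypothetical. The diffusion tensor is then read off as S/(2Δ), the relation implied by Eq. 8.A.9 with Λ = γδg.

```python
import numpy as np

rng = np.random.default_rng(0)

n_particles = 20000
n_steps = 50                                  # N in the text
half_widths = np.array([3.0, 2.0, 1.0])       # anisotropic step box (arbitrary units)

# Each step is uniform in [-a, a] along each axis, so <r_i r_j> = a_i^2 / 3 on the diagonal.
steps = rng.uniform(-1.0, 1.0, size=(n_particles, n_steps, 3)) * half_widths
R = steps.sum(axis=1)                         # net displacement of every particle

S_est = R.T @ R / n_particles                 # empirical second-moment matrix of R
S_theory = n_steps * np.diag(half_widths ** 2 / 3.0)

dt = 1e-3                                     # assumed duration of one step
diffusion_time = n_steps * dt                 # plays the role of Delta
D_est = S_est / (2.0 * diffusion_time)
D_theory = S_theory / (2.0 * diffusion_time)
```

Because only the second moments of the step distribution enter, any zero-mean step distribution with the same second moments would give the same tensor in the large N limit.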
Abstract The long-term properties of the short-term largest Lyapunov exponents (STLmax) 
of EEG have been used successfully in epileptic seizure prediction. STLmax 
profiles also show short-term patterns characterizing seizures that can be used 
for detection purposes. In this paper, we explore two such properties shown by 
the STLmax data during seizures, and develop and compare automatic seizure 
detection algorithms on over 1,000 hours of data. 

Keywords: STLmax, seizure detection, epilepsy. 
1. Introduction 

The design of automated seizure detectors has been addressed in the 
literature with many different methodologies [2, 3] and with varying degrees of 
success. An automatic seizure detector is useful in situations like long-term 
patient monitoring, clinical characterization of epilepsy, and even automated 
drug delivery. It is desirable to have an algorithm that is online, completely 
automated and patient independent. 

The timing of seizure onset and the identification and classification of 
seizures are highly subjective tasks, even for electroencephalographers. This 
variability attests to the difficulty of the task. It is therefore no surprise that automated 
seizure detection algorithms are plagued by false alarms and missed detections. 
All of the previous attempts have utilized the EEG signal directly [2, 3]. In 
this paper we investigate a dynamical pre-processor based on the short-term 
largest Lyapunov exponents (STLmax), which have been shown to be useful for 
predicting epileptic seizures [7]. The largest Lyapunov exponent is computed in 
windows sufficiently long to obtain a stable estimate, but short enough to track 
the dynamical changes underlying neuronal populations, just as windowed 
FFTs can track spectral changes in time series (hence the name STLmax). 
Estimation of STLmax on every successive non-overlapping EEG window yields 
a new time series that can be used to build automated seizure detection 
algorithms. The hope with this dynamical pre-processor is that the many artifacts 
plaguing the EEG will be easier to discriminate, preserving the detectability of 
the seizure events and yielding better detectors. A shortcoming that we have 
to live with is the degradation in temporal resolution produced by the STLmax 
estimation. The paper is organized as follows. We start by briefly reviewing the 
STLmax algorithm that builds the dynamical pre-processor, and then illustrate 
STLmax features associated with several types of epileptic seizures. We then 
propose two methodologies to build detectors, and compare their 
performance on an extensive data set comprising over 1000 hours of at least 
26 channels of electrocorticographic (ECoG) data from 6 patients with temporal 
lobe epilepsy. 

2. Nonlinear Dynamic Preprocessing 
2.1. STLmax: Brief Introduction 

In a series of papers [4, 5, 7], the Brain Dynamics Group at the University of 
Florida has proposed to model the transition from the normal to the epileptic state as a 
dynamical transition of the brain from a chaotic to a more ordered state [5, 7]. 
Since we do not possess a formal description of brain dynamics, dynamical 
parameters must be estimated directly from data. The largest Lyapunov 
exponent (LLexp) appears to be a good compromise between a sensitive description of 
relevant dynamical properties (such as the transition from a less ordered to a more 
ordered state) and effective estimation from the EEG. But since the brain, modeled 
as a dynamical system, possesses time varying parameters, the conventional 
method of estimating LLexp [8] must be adapted to epileptic EEG analysis. 
The short-term largest Lyapunov exponent, termed STLmax, has been proposed 
by Iasemidis and Sackellares as a means to quantify the evolving brain 
dynamics [4]. We briefly review the algorithm below. If we denote by $L$ the 
estimate of the short-term largest Lyapunov exponent STLmax in a window 
[4, 8], then 

$$ L = \frac{1}{N_a \Delta t} \sum_{i=1}^{N_a} \log_2 \frac{|\delta X_{i,j}(\Delta t)|}{|\delta X_{i,j}(0)|} \qquad (1) $$

where 

■ $t_i = t_0 + (i - 1)\Delta t$ and $t_j = t_0 + (j - 1)\Delta t$, with $i \in [1, N_a]$ and 
$j \in [1, N_a]$ 

■ $\Delta t$ is the evolution time for $\delta X_{i,j}$. If the evolution time $\Delta t$ is given in 
seconds, then $L$ is in bit/sec 

■ $t_0$ is the initial time point of the fiducial trajectory 

■ $X(t_i)$ is a vector of the fiducial trajectory $\phi_t(X(t_0))$, where $t = t_0 + 
(i - 1)\Delta t$, $X(t_0) = [x(t_0), \ldots, x(t_0 + (p - 1)\tau)]^T$, and $X(t_j)$ is a 
properly chosen vector adjacent to $X(t_i)$ in the phase space 

■ $\delta X_{i,j}(0) = X(t_i) - X(t_j)$ is a perturbation of the fiducial orbit at $t_i$, and 
$\delta X_{i,j}(\Delta t) = X(t_i + \Delta t) - X(t_j + \Delta t)$ is the evolution of $\delta X_{i,j}(0)$ 
after time $\Delta t$ 

■ $N_a$ is the necessary number of iterations for the convergence of the $L$ 
estimate for the data segment of $N$ points (absolute time duration $T$) 

■ If $Dt$ is the sampling period of the time domain data, then $T = (N - 
1)\,Dt = N_a \Delta t - (p - 1)\tau$. 

The phase space reconstruction is done with an embedding dimension p of 
7 and a delay τ of 7 samples. The values of Δt and T are fixed at 20 points and 
10.24 seconds, respectively [4]. STLmax values are calculated for each channel 
independently with a non-overlapping window of 2048 ECoG samples, which 
makes the time scale of each STLmax point equal to 10.24 seconds. 
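The windowing arithmetic and the phase space reconstruction quoted above can be illustrated with a short sketch. It only builds the delay vectors X(t) = [x(t), ..., x(t + (p - 1)τ)]^T and checks the window duration; it does not estimate the Lyapunov exponent itself. The function name and the placeholder signal are ours, and the 200 Hz sampling rate is the one given for this data set in the next section.

```python
def delay_embed(x, p=7, tau=7):
    """Return the delay vectors [x[t], x[t + tau], ..., x[t + (p - 1) * tau]]."""
    span = (p - 1) * tau
    return [[x[t + k * tau] for k in range(p)] for t in range(len(x) - span)]


SAMPLING_RATE_HZ = 200        # ECoG sampling rate of this data set
WINDOW_SAMPLES = 2048         # non-overlapping STLmax window
EVOLUTION_POINTS = 20         # evolution time in samples

window_seconds = WINDOW_SAMPLES / SAMPLING_RATE_HZ       # duration of one STLmax point
evolution_seconds = EVOLUTION_POINTS / SAMPLING_RATE_HZ  # evolution time in seconds

# Placeholder signal: sample indices stand in for one window of ECoG values.
window = list(range(WINDOW_SAMPLES))
vectors = delay_embed(window, p=7, tau=7)
```

Each 2048-sample window therefore yields 2048 - 6 × 7 = 2006 seven-dimensional phase space vectors and one STLmax value every 10.24 seconds.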



2.2. Data Set 



The development and validation of the epileptic seizure detector are 
conducted on a set of 6 patients with temporal lobe epilepsy (TLE). All the patients 
are resistant to drug therapy and are candidates for brain surgery. The ECoG 
(ElectroCorticoGram) is recorded from depth electrodes as shown in Figure 9.1. Each 
circle represents a single electrode contact. The location of depth electrodes and 
subdural electrodes is depicted schematically. A_R and A_L represent subdural 
electrode strips placed under the right and left orbitofrontal cortex. B_R and B_L 
are subdural strips that are placed under the inferior temporal cortex. C_R and 
C_L are multi-contact depth electrodes inserted stereotactically in the right and 
left hippocampi, respectively. 



Figure 9.1. Placement of electrodes used to record the ECoG in the 6 patients considered in 
this work 



The ECoG data from the electrodes is first digitized at 200 Hz with 10 bits 
precision. It is then band pass filtered at 0.1-70 Hz and recorded onto VHS tapes 
synchronized with video recordings. Over 1000 hours of data from 6 patients with 
different types of seizures are analyzed in this work, as summarized in 
Table 9.1. 

Even in the case of implanted electrodes, data corruption can occur. For 
example, the wire connecting the electrode to the recording device can be loose 
or broken. The corrupted portions of the data are removed and the remaining 
segments concatenated, instead of being replaced with zeros, because zeros 
misrepresent the data. Of course, concatenation also misrepresents the data, as the 
adjoining segments may be in totally different states, even though it makes the data 
seem continuous. We choose concatenation as it gives better results for the 
methods we use. Except for patient 4, the data set is continuous for all the patients. 
2.3. Properties of STLmax 

The data set contains four different types of seizures as scored by 2 
independent neurologists [7]: 

1 PSG: Complex Partial evolving to Secondarily Generalized (Figure 9.3) 

2 CP: Complex Partial (Figure 9.4) 

3 SC: Sub Clinical (Figure 9.5) 

4 SP: Simple Partial (Figure 9.6) 

The STLmax profiles show 2 patterns pertaining to all these types of epileptic 
seizures. They are as follows: 

1 A drop in the STLmax mean value with respect to the preictal state, 
corroborating the hypothesis that the brain dynamics become more or- 
dered [4]. It is interesting that this feature (a drop in value) is very 
different from the observed amplitude of the EEG, which normally in- 
creases during seizures. This drop can be used to detect seizures by a 
simple threshold mechanism. 

2 Entrainment between the STLmax of different channels before a seizure [5]. 
In signal processing terms, this means that the “distance” between the 
STLmax trajectories in phase space will decrease when an ictal event is 
pending. 

Figure 9.2 shows a typical seizure from two focal electrodes (Right Temporal 
Depth) where these two characteristics are clearly visible. However, it should 
be noted that the STLmax does not always show such obvious characteristics for 
all seizures and all electrodes, either because of inter-seizure dynamical 
differences, or because the parameters used to estimate STLmax are seizure specific. 
The set of parameters used to estimate STLmax was held fixed throughout our 
experiments and works very well for complex partial seizures evolving into 
secondarily generalized seizures. Figure 9.3 to Figure 9.6 show examples of STLmax 
profiles for different seizure types. 

Table 9.1. Patient details. 

Patient  Sex  Age  No. of    Length of STLmax data    No. of seizures by type 
#                  channels  Hours    No. of breaks   CP    PSG   SC    SP    Total 
1        F    41   30        ~211     2               19    3     1     0     23 
2        M    29   28        ~146     0               8     0     10    1     19 
3        F    38   32        ~22      0               8     0     0     0     8 
4        M    60   28        ~287     154             0     7     0     0     7 
5        F    45   26        ~87      4               3     0     6     0     9 
6        M    19   28        ~321     6               2     8     7     0     17 
Total                        ~1075    166+5           40    18    24    1     83 

Figure 9.2. STLmax values during a seizure. The column of stars marks the seizure occurrence. 
[Axis label: minutes.] 

Figure 9.3 to Figure 9.6 reinforce the observation that there are 
commonalities (a drop in STLmax) but also differences among the seizure types. Hence, 
we can expect that the level of performance in detecting a seizure will vary 
depending on the seizure type. Another pertinent observation is the loss of 
temporal resolution in the STLmax profiles. As can be observed in these 
figures, the seizure onset and seizure ending can only be determined with 10.24 
seconds precision, and the dynamic transition itself takes several samples and 
varies from seizure to seizure. The same observation can be made for the 
transition to the postictal state. These features imply that STLmax based detectors 
will provide delayed warnings in real-time applications. However, they may be 
more reliable than detectors applied to the ECoG directly. 

The data from patient 1 is used as the training set because of the diversity 
of seizure types, and data from patients 2-6 are used as the test set. Table 9.2 
describes the data and seizure statistics for the training and test sets. 
Figure 9.3. Example EEG and STLmax signatures for partial secondarily generalized seizure 

Figure 9.4. Example EEG and STLmax signatures for complex partial seizure 

Figure 9.5. Example EEG and STLmax signatures for subclinical seizure 

Figure 9.6. Example EEG and STLmax signatures for simple partial seizure 



Table 9.2. Distribution of data among training and testing sets. 

                  Training set                     Test set 
                  No. of samples   % of data set   No. of samples   % of data set 
Data length       74471            19.7            303365           80.29 
No. of seizures   23               27.7            60               72.30 
3. Detection Methods 

We compare two basic detection methods that exploit the two 
characteristics of STLmax profiles during a seizure. Due to space limitations we 
present the methods briefly and avoid discussions of parameter 
settings, which can be found in [6]. 

3.1. Threshold on STLmax profiles 

Observation of the STLmax profiles shows that an epileptic seizure is related 
to a drop in STLmax. Hence, a simple detection scheme thresholds the STLmax 
and declares an event when the signal goes below the threshold. Since we have 
multiple channels of STLmax data, there are a number of possible combinations 
in which the multi-channel STLmax time series can be used. For example, the 
detection mechanism can operate on a subset of the channels, as follows: 

1 a drop in the minimum STLmax among the channels 

2 a drop in the average STLmax among the channels 

3 a drop in the maximum STLmax among the channels 

4 a drop in the median STLmax among a subset of channels 

5 a drop in the variance among the channels 

Alternatively, the detection can be done in each of the channels independently 
with a voting scheme, e.g. a drop in a critical mass of channels. Our hypothesis 
is that, for a seizure to happen, there might be a critical mass of brain areas that 
display a drop in the STLmax. 
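The decision rules listed above can be sketched as follows. This is a minimal illustration, not the detector evaluated in this chapter: the STLmax values, the threshold, and the number of votes are made-up numbers, the function names are ours, and the variance rule would need its own threshold in practice since it is on a different scale.

```python
import statistics

# Spatial statistics computed across channels at one time instant.
SPATIAL_STATS = {
    "min": min,
    "mean": statistics.fmean,
    "max": max,
    "median": statistics.median,
    "var": statistics.pvariance,
}


def subset_alarm(frame, stat, threshold):
    """Alarm when the chosen spatial statistic of the frame drops below threshold.

    frame: STLmax values of all channels at one time instant.
    """
    return SPATIAL_STATS[stat](frame) < threshold


def voting_alarm(frame, threshold, min_votes):
    """Alarm when at least min_votes channels are individually below threshold."""
    votes = sum(1 for value in frame if value < threshold)
    return votes >= min_votes


# Hypothetical STLmax values (bit/sec) for five channels.
interictal = [5.1, 4.8, 5.3, 4.9, 5.0]
ictal = [2.1, 2.4, 4.9, 2.0, 2.2]      # one channel does not take part in the drop

alarms = {stat: subset_alarm(ictal, stat, threshold=3.0)
          for stat in ("min", "mean", "max", "median")}
```

In this toy frame the minimum, mean, and median rules fire while the maximum rule does not, because a single uninvolved channel keeps the maximum high; the voting rule fires as soon as enough channels drop, regardless of which ones.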

3.2. Entrainment in STLmax and Seizure Detection 

The second characteristic, observed before a seizure, is the co-variability of 
the STLmax profiles among channels and their convergence to a small interval of 
values, which we call entrainment [4]. One possibility to measure entrainment 
would be the correlation coefficient, but this is a linear measure and it is limited 
to channel pairs. We decided to implement the Diks distance, proposed recently 
to measure the difference between attractors in chaotic systems [1]. Using the Diks 
distance one can create a statistic that yields the Diks test. A seizure then 
manifests as a decrease in Diks distance just prior to and during the seizure. 

3.2.1 Brief introduction: Diks test and Diks distance. In simple 
words, the Diks test [1] measures the distance between two multidimensional 
probability distributions, constructed from the delay vectors, normalized by its 
variance. 

Consider the case of $N_1$ vectors $\{X_i\}$, with $X = [x(n), x(n - \tau), \ldots, x(n - 
(m - 1)\tau)]$ and probability distribution $p_1(X)$, and $N_2$ vectors $\{Y_j\}$ 
with probability distribution $p_2(Y)$. Realizations of these vectors are denoted 
by their lowercase analogs. The smoothed versions of the multidimensional 
distributions ($p'_k$) are constructed as 

$$ p'_k(\vec{r}) = \int d\vec{s} \; p_k(\vec{s}) \, \kappa(\vec{r}, \vec{s}) \quad \text{for } k \in \{1, 2\} \qquad (2) $$

where $\kappa(\vec{r}, \vec{s})$ is the Gaussian kernel defined as 

$$ \kappa(\vec{r}, \vec{s}) = (2\pi d^2)^{-m/2} \exp\left( -\frac{|\vec{r} - \vec{s}|^2}{2 d^2} \right) \qquad (3) $$

where $d > 0$ is the bandwidth (kernel standard deviation). 

An unbiased estimator $\hat{p}'_1(\vec{r})$ of $p'_1(\vec{r})$ (and analogously for $k = 2$, using the $\vec{y}_j$) is 

$$ \hat{p}'_1(\vec{r}) = \frac{1}{N_1} \sum_{i=1}^{N_1} \kappa(\vec{r}, \vec{x}_i) \qquad (4) $$

since the expected value of $\kappa(\vec{r}, \vec{X}_i)$ is 

$$ \int d\vec{s} \; p_1(\vec{s}) \, \kappa(\vec{r}, \vec{s}) = p'_1(\vec{r}) \qquad (5) $$

The squared Euclidean distance between the smoothed distributions is defined as 

$$ Q = (2 d \sqrt{\pi})^{m} \int d\vec{r} \; \left[ p'_1(\vec{r}) - p'_2(\vec{r}) \right]^2 \qquad (6) $$

More details about the properties of the measure are presented in [1]. It 
is claimed that the above distance becomes zero if and only if the two 
distributions are identical. By determining whether a consistent estimator of Q is 
significantly above zero, we can test the null hypothesis $p_1 = p_2$ against all 
alternatives $p_1 \neq p_2$. 

From the training set we selected an embedding dimension of 6 and a delay 
of 7 to reconstruct the attractors, and used 60 sample segments with 59 sample 
overlap from the STLmax time series. The Gaussian kernel bandwidth was 
chosen as one. Figure 9.7 displays the Diks distance for a seizure, where it is 
apparent that the distances between all the channel pairs decrease 
towards and during this seizure. The delay between the trough and the time of 
seizure occurrence corresponds to the number of samples needed to estimate 
the distance with reasonable accuracy. 
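A sample estimate of Q can be sketched as below. Integrating the product of two Gaussian kernels of bandwidth d over r and multiplying by the prefactor in Eq. (6) leaves the pairwise term h(r, s) = exp(-|r - s|^2 / (4 d^2)), so Q can be estimated from averages of h within and between the two sets of delay vectors. This is our own minimal sketch of that idea, not the authors' code: the function names are hypothetical, the series are synthetic, and a longer segment than the 60 samples used in the chapter is taken to keep the toy estimate stable.

```python
import numpy as np


def delay_vectors(x, m=6, tau=7):
    """Stack delay vectors row-wise (same set as in the text, components in forward order)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[k * tau: k * tau + n] for k in range(m)])


def _mean_pair_kernel(a, b, d, exclude_diagonal):
    """Average of h(r, s) = exp(-|r - s|^2 / (4 d^2)) over pairs from a and b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    h = np.exp(-sq / (4.0 * d * d))
    if exclude_diagonal:                      # drop the i == j terms for within-set averages
        n = len(a)
        return (h.sum() - np.trace(h)) / (n * (n - 1))
    return h.mean()


def diks_distance(x_vecs, y_vecs, d=1.0):
    """Estimate of Q in Eq. (6) from two sets of delay vectors."""
    return (_mean_pair_kernel(x_vecs, x_vecs, d, True)
            + _mean_pair_kernel(y_vecs, y_vecs, d, True)
            - 2.0 * _mean_pair_kernel(x_vecs, y_vecs, d, False))


rng = np.random.default_rng(1)
series_a = rng.normal(0.0, 1.0, 200)
series_b = rng.normal(0.0, 1.0, 200)      # same distribution as series_a
series_c = rng.normal(5.0, 1.0, 200)      # shifted distribution

va, vb, vc = (delay_vectors(s) for s in (series_a, series_b, series_c))
q_near = diks_distance(va, vb, d=1.0)
q_far = diks_distance(va, vc, d=1.0)
```

Two series drawn from the same distribution give an estimate near zero, while the shifted series gives a clearly larger value, which is the behaviour the detector exploits when channels become entrained.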
Figure 9.7. Diks distances for all the possible pairwise combinations of electrodes during a 
seizure. 

4. Results and Discussion 

4.1. Definitions and Trade-offs 

In this chapter, we present the results of the two detection methodologies. 
First, however, we have to specify the details of the definitions and validation. 
In order to compare different methods, the following clinically relevant receiver 
operating characteristic (ROC) criteria are used: 

1 Percentage of seizures detected. Along with it, we also enforce a maximum 
delay between the seizure onset and the instant the algorithm 
declares it as a seizure (Figure 9.8). 

2 Number of false alarms per hour. For clinical acceptability, we restrict 
the number of false alarms to 1 per hour. However, to avoid multiple 
triggering on the same event, we enforce a “dead period” after detection. 

The following definitions are used to determine the above criteria: 

Hit - The alarm turns ON within the maximum allowed delay from the 
seizure onset. 

We still need to define when the alarm turns off. From Figure 9.9, we can 
see that this particular threshold will produce two alarms for the same seizure. 
Using our definitions above, we would count a true detection and a false alarm 
for this event, which is not right. 

Figure 9.8. The delay between the instant of the actual event and the instant of 
the alarm. 

Figure 9.9. Subset of STLmax showing the same seizure producing two alarms. 

We can avoid such a situation by clustering the alarms. In the data set under 
study, the shortest interval between successive seizures is 18 minutes. 
Therefore, all alarms that occur within a span of 18 minutes can be clustered into a 
single detection (with the timing of the first alarm) without fear of 
one alarm representing more than one seizure. Implicitly we are constraining the 
resolution of the algorithm, i.e., the minimum amount of time the algorithm 
requires between seizures to identify them separately. In this work, a 
conservative estimate of 15 minutes is used. Also, if an alarm lasts for more than 15 
minutes, it is reset after 15 minutes and counted as more than one alarm. Table 
9.3 describes the mnemonics used in the ROC plots. Note that the 
median is applied on a subset of N channels at each instant. Therefore, it is 
presented separately, as it has one extra variable N. 
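The clustering and scoring rules above can be sketched as follows. This is an illustration under our own assumptions rather than the evaluation code used for the reported results: times are in minutes, the function names and example times are hypothetical, and an alarm is counted as a hit only if it turns on at or after the marked seizure onset and within the maximum allowed delay.

```python
def cluster_alarms(alarm_times, window=15.0):
    """Merge alarms closer than `window` minutes to the first alarm of a cluster.

    Each cluster keeps the time of its first alarm; an alarm that persists beyond
    the window starts a new cluster, so it is counted more than once.
    """
    clustered = []
    for t in sorted(alarm_times):
        if not clustered or t - clustered[-1] >= window:
            clustered.append(t)
    return clustered


def score_detector(alarm_times, seizure_onsets, max_delay, hours, window=15.0):
    """Return (percentage of seizures detected, false alarms per hour)."""
    detections = cluster_alarms(alarm_times, window)
    detected = set()
    false_alarms = 0
    for t in detections:
        matches = [s for s in seizure_onsets if 0 <= t - s <= max_delay]
        if matches:
            detected.add(matches[0])
        else:
            false_alarms += 1
    sensitivity = 100.0 * len(detected) / len(seizure_onsets)
    return sensitivity, false_alarms / hours


# Hypothetical 5 hour record: two seizures, four raw alarms.
seizure_onsets = [100.0, 200.0]
raw_alarms = [101.0, 104.0, 150.0, 230.0]

clusters = cluster_alarms(raw_alarms)
result = score_detector(raw_alarms, seizure_onsets, max_delay=5.0, hours=5.0)
```

Here the alarms at 101 and 104 minutes collapse into one detection of the first seizure, the second seizure is missed, and the two remaining alarms count as false, giving 50% detection at 0.4 false alarms per hour.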

4.2. Test Set Results 

We first present the ROC results as a function of how fast the alarm is turned 
on after the occurrence of the true event (which has been determined by trained 
neurologists). To improve the visibility of the plots, only the best results for each 
subset of the detectors (STLmax, Diks distance) at each false alarm rate are 
shown. To make the comparisons easier, the axes are set to show detection rates 
from 0% to 100% and false alarm rates from 0.03/h to 1/h. From Figure 
9.10, we can see that STLmax works best for fast detectors (delay of 1 
minute) at all false alarm rates except near 1/h. However, the performance 
is still not acceptable, with a detection rate of at most ~55% for the minimum of 
Diks distance. When the allowed delay in detection is increased to 5 minutes, 
not only does the overall performance of all the detectors improve, but the best 
performer also becomes the minimum of Diks distance, with a detection rate of 82%. 
We expected that Diks distance would be the best performer at intermediate 
lags, because it measures entrainment that starts occurring before the seizure. 
However, it requires a window of data to be estimated properly, and at 5 minutes 
it performs better than subsets of STLmax in most cases. 



Table 9.3. Table of mnemonics. 

Signal    Subset 
          Minimum    Average    Maximum    Variance 
STLmax    minlmax    meanlmax   maxlmax    lmaxvar 
Diks      mindiks    meandiks   maxdiks    diksvar 

Figure 9.10. ROCs of subsets on the test set. [Panels: max. delay = 1 min, max. delay = 5 min, 
max. delay = 10 min. Axes: # of false alarms/hour versus % of seizures detected. Legend: minlmax, 
meanlmax, maxlmax, lmaxvar, mindiks, meandiks, maxdiks, diksvar.] 
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Figure 9.11. ROCs of median subsets on the test set. 
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For 10 minute delay between the seizure onset and the alarm, the best per- 
former is still the minimum of Diks distances and the figure of merit is rather 
high, 87% sensitivity for 1 false alarm per hour. This value seems far better 
than the reported performance of seizure detectors based on EEG [2, 3], but it 
requires a direct comparison in our data set. 

We next analyze the differences in performance among the multi-channel
decision rules and the voting. The voting schemes perform slightly worse than
their counterparts (around 2-5% lower sensitivity (detection rate) for the same
false alarm rate). The main reason is that, although there might be a critical
mass mechanism operating during seizures, the amount of brain involved may
differ from seizure to seizure. With respect to the median operator, it
performed a little better in the case of STLmax, but it was poorer in the case of
Diks distance. Further discussion will be omitted here.
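A voting rule of the kind compared above can be sketched as follows, assuming each channel casts a vote when its value crosses a threshold and an alarm is raised when enough channels agree. The function name `vote_alarm`, the `min_votes` parameter, and the default threshold direction are hypothetical choices for illustration, not details taken from the chapter.

```python
import numpy as np

def vote_alarm(x, threshold, min_votes, below=True):
    """Raise an alarm in each window where at least `min_votes` channels vote.

    x: array of shape (n_channels, n_windows).
    below: if True a channel votes when its value is under the threshold
           (the direction is an assumption and should match the measure used).
    """
    x = np.asarray(x, dtype=float)
    votes = (x < threshold) if below else (x > threshold)
    return votes.sum(axis=0) >= min_votes

# Toy data: 3 channels, 3 windows (made-up values).
x = np.array([[1.0, 5.0, 2.0],
              [2.0, 6.0, 7.0],
              [3.0, 1.0, 8.0]])
alarm = vote_alarm(x, threshold=4.0, min_votes=2)
```

Because `min_votes` is fixed, a seizure that involves fewer channels than usual can fall short of the quorum. This is one way to read the remark that the amount of brain involved may differ from seizure to seizure.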

4.3. Case studies: Best Performance on the Test Set 

4.3.1 Quick Detector: STLmax. From Figure 9.10 to Figure 9.12,
for short delays, we can see that the spatial median of STLmax and the spatial
minimum of Diks distance perform better than the others. Since the former
has lower computational complexity than the latter, it is chosen to be studied in
terms of the seizure types, keeping the maximum delay allowed at 1 minute as
well as the maximum false alarm rate at 1 per hour.

Figure 9.13 presents the performance of the median operator for different
values of N, the number of channels. We can see that a subset of 13 channels
gives the best performance; therefore, it will be analyzed in detail. From Figure
9.14, we can see that clinical seizures are detected better than sub-clinical ones.
Also, we can see that the algorithm performs best for secondarily generalized
complex partial seizures.

4.3.2 Slow Detector: Spatial minimum of Diks distance. From
Figure 9.10 and Figure 9.12, for long delays, we can see that the spatial minimum
of Diks distance produces very good results (almost 90%). Analyzing it in
detail, from Figure 9.15, we can see that it performs well for clinical seizures
(both generalized and local). One reason for this may be that the training set
consists mostly of clinical seizures.

5. Conclusion and Directions for Future Work 

In this work, two different detection methods are studied, each focusing on
one of the characteristics of the STLmax signature of a seizure. The results are
summarized in Table 9.4.

For quick response detectors, the STLmax thresholding performed better 
than Diks distance. This is expected due to the inherent delay in the computa- 



[Figure: Comparison of ROCs with alarm duration = 15 min. on the test set;
% of seizures detected versus # of false alarms/hour. Legend: STLmax 1 min,
Diks 1 min, STLmax 5 min, Diks 5 min, STLmax 10 min, Diks 10 min.]

Figure 9.12. ROCs of voting on the test set.

Figure 9.13. ROC of median subsets for a maximum allowable delay of 1 minute on the test
set.






[Figure: ROC of different seizure types for spatial median of STLmax.]

Figure 9.14. ROC of different seizure types using spatial median of STLmax for a maximum
allowable delay of 1 minute on the test set.

Table 9.4. Best detection rates on test set for false alarm rate < 1/h (ordered by result).

Delay (min)
1                        5                        10
Median STLmax (58.33)    Min. Diks (81.67)        Min. Diks (86.67)
Min. Diks (58.33)        Median Diks (78.52)      Median Diks (80.19)
Median Diks (54.07)      STLmax vote (76.67)      STLmax vote (78.33)
Min. STLmax (53.33)      Median STLmax (65.00)    Median STLmax (71.67)
Mean STLmax (53.33)      Min. STLmax (63.33)      Var. Diks (70.00)
Diks vote (50.00)        Mean STLmax (60.00)      Mean Diks (65.00)
Max. STLmax (41.67)      Max. Diks (60.00)        Min. STLmax (65.00)
STLmax vote (36.67)      Var. Diks (58.33)        Max. Diks (61.67)
Var. STLmax (36.67)      Mean Diks (56.67)        Mean STLmax (60.00)
Mean Diks (28.33)        Diks vote (50.00)        Diks vote (53.33)
Max. Diks (28.33)        Max. STLmax (43.33)      Max. STLmax (43.33)
Var. Diks (26.67)        Var. STLmax (36.67)      Var. STLmax (36.67)

[Figure: ROC of different seizure types for spatial minimum of Diks.]

Figure 9.15. ROC of minimum Diks distance for a maximum allowable delay of 1 minute on
the test set.



tion of Diks distance. Also, subsets of STLmax performed better than others 
in the case of secondarily generalized seizures. The minimum of Diks distance
gave a good generalization for all types of seizures under study. Due to the
simplicity of the computation, unprocessed STLmax is ideally suited for real
time implementation. Overall, we think that the performance in terms of
detectability (87%) for one false alarm per hour makes STLmax a very interesting
preprocessor for automated seizure detection in long term monitoring of
epileptic patients. Obviously, the downside is the delay between seizure occurrence
and alarms, which must be at least 5 minutes. At this point it precludes the
use of these detectors for real time applications such as drug intervention or
warning. However, if the problem is one of transient response, we can overlap
the windows for STLmax estimation and decrease the amount of time the Diks
distances need to converge to their final values among preictal and ictal events.
More research needs to be conducted on this aspect.
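The idea of overlapping the estimation windows can be illustrated with a small sketch. It only computes where each window would start. The function name `window_starts` and the sample counts are hypothetical, and the STLmax estimation itself is not shown.

```python
def window_starts(n_samples, window, step):
    """Start indices of analysis windows of length `window`, advanced by `step`.

    step == window gives non-overlapping windows; step < window overlaps them.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if n_samples < window:
        return []
    return list(range(0, n_samples - window + 1, step))

# Made-up sizes: 100 samples, windows of 40 samples.
non_overlapping = window_starts(100, window=40, step=40)
overlapping = window_starts(100, window=40, step=10)
```

With the smaller step, a new estimate becomes available every 10 samples instead of every 40, so any measure computed from the estimates is refreshed more often. Whether this actually shortens the convergence time of the Diks distance is the open question raised above.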

All the parameters were found using the training set, and of course, they
may not be the optimal ones for the test set. Also, the use of Diks distance is
based on the assumption of entrainment between electrodes, which may not be
true for all types of seizures. This reasoning means that we may have to adapt
these parameters for each patient to improve performance. Another aspect that
requires further analysis is the best combination of all these detectors. Due to
the sheer number of parameters, a genetic search algorithm may be required.

Introduction

Neglect is a complex and devastating human neuropsychological disorder
characterized by a failure to attend to novel or meaningful stimuli presented to
the side contralateral to a brain lesion, in the absence of a primary sensory or
motor dysfunction. Some manifestation of neglect is found in approximately
40% of all cases of brain damage, although more frequently following right
hemisphere damage. The vast majority (80-90%) of all cases of neglect are
produced by destruction of one of three cortical regions: the dorsolateral
prefrontal cortex, the cingulate cortex, or the parietotemporal junction [Heilman
et al., 1993, Karnath et al., 2001, Mesulam, 1980, Mesulam, 1990]. The neglect
syndrome goes far beyond a lack of responsiveness to contralesional stimuli
to include dramatic attentional and cognitive spatial deficits. These include
several major associated disorders. Patients may exhibit hemispatial neglect, in
which the patient does not have a cognitive representation of the contralesional
hemispace and will omit details of an imagined scene. Hemiinattention to
contralesional stimulation in one or several sensory modalities is also quite
common. The patients may display hemiakinesia, or a paucity of movement
of the contralesional limbs. Inappropriate orientations (allesthesia) may also
occur, in which the patient will respond or orient to contralesional stimulation
as if it came from the ipsilesional side. Extinction to bilateral simultaneous
stimulation may also be exhibited: often during the course of recovery from
severe neglect, a patient stimulated on the contralesional side alone
may respond to the stimulation, but if stimulated bilaterally may only
demonstrate awareness of the ipsilesional stimulus. Further, these patients may
exhibit anosognosia, a lack of awareness of their current neurological state,
or anosodiaphoria, in which they are unconcerned about their neurological
status [Heilman et al., 1993].

As a consequence, patients may fail to recognize the limbs on the
contralesional side as their own, eat from only one side of a plate, and groom only
their ipsilesional side. The cognitive, affective, and motor disturbances that
comprise the neglect syndrome are often debilitating to the patients and their
families [Robertson et al., 1993]. The failure to respond to contralesional
stimuli has devastating effects on the patients' abilities to live independently
or to return to work, and the presence of neglect in patients has been
found to be the best predictor of a poor prognosis for recovery [Denes et al.,
1982, Fullerton et al., 1997, Kerkhoff, 2001]. Further, neglect has been shown
to be related to poor recovery from related deficits such as hemiplegia [Denes
et al., 1982].

Recovery, when it occurs, is spontaneous over the course of weeks to months,
but often it is incomplete. Many neglect patients, perhaps as many as 15-25%
[Kerkhoff, 2001], continue to ignore or neglect contralesional stimuli for months
or years post lesion [Halligan and Marshall, 1994, Henley et al., 1985, Kerkhoff,
2001]. They are often unaware of their neurological status (anosognosia), or
are unconcerned and affectively flat (anosodiaphoria) [Heilman et al., 1993].
The lack of awareness and changes in motivation interfere with successful
physical and occupational rehabilitation [Robertson et al., 1993]. Behavioral
treatments rarely generalize outside of the therapeutic context or across tasks
within the same therapeutic context [Couvier et al., 1987, Robertson et al.,
1993]. Drug therapies are rarely used to treat neglect because there has been
no rational framework for understanding the mechanisms which might lead to
a therapeutic effect. Typically, drugs have been given only to patients with
chronic neglect with stable behavioral baselines, because of the concern that
drug effects may interfere with ongoing recovery [Fleet et al., 1987, Geminiani
et al., 1998, Hurford et al., 1998].

1. The Rodent Model of Neglect and Recovery 

The poor prognosis and the absence of generally accepted therapies for ne- 
glect have led to the development of animal models of neglect [Milner, 1987]. 
We have developed a rat model of neglect to examine the basic mechanisms 
of neglect, and the potential for recovery of function. A rodent model offers 
several practical advantages in terms of accessibility and cost over many of the 
primate models, and we have found striking behavioral, pharmacological and 
anatomical similarities between the systems related to neglect in rodents and 
those in primates [Corwin and Reep, 1998].

We [Burcham et al., 1997, Corwin and Reep, 1998] have found that in rats,
as in primates [Heilman et al., 1993], there is a cortical network for directed
attention, involving the medial agranular (AGm), posterior parietal (PPC), and
ventrolateral orbital cortices. The symptoms of neglect in rodents are similar
to those found in human neglect patients including: severe neglect of visual,
tactile, and auditory stimuli presented in the hemispace contralateral to the
lesion, extinction, allesthesia, and disorders of spatial processing [King and
Corwin, 1992, King and Corwin, 1993, Corwin and Reep, 1998]. Severe neglect
is defined as the contralesional side being only 33% as responsive as the
ipsilesional side. The tests for neglect and extinction in rodents were designed 
to mirror the bedside testing of neglect patients [Crowne et al., 1983, Corwin 
et al., 1986, Van Vleet et al., 2000, Van Vleet et al., 2002]. Selective disconnec- 
tion of the corticocortical axons linking AGm and PPC in the absence of direct 
damage to either area results in neglect, demonstrating that it is the integrity of 
the network that is critical for normal directed attention [Burcham et al., 1997]. 
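The severity criterion above compares responsiveness on the two sides. As a minimal worked sketch, one could express it as a ratio of contralesional to ipsilesional responses. The function names, the response counts, and the use of the 0.4 cutoff mentioned in the caption of Figure 10.3 are illustrative assumptions; the exact scoring procedure used in the cited studies is not given here.

```python
def neglect_ratio(contra_responses, ipsi_responses):
    """Ratio of contralesional to ipsilesional responsiveness (assumed definition)."""
    if ipsi_responses <= 0:
        raise ValueError("ipsilesional responsiveness must be positive")
    return contra_responses / ipsi_responses

def is_severe(ratio, cutoff=0.4):
    """Flag severe neglect; the default cutoff follows the Figure 10.3 caption."""
    return ratio < cutoff

# Made-up counts: 5 contralesional responses versus 15 ipsilesional ones.
ratio_lesioned = neglect_ratio(5, 15)
ratio_normal = neglect_ratio(10, 10)
```

Under this reading, a subject whose contralesional side is 33% as responsive as the ipsilesional side has a ratio of about 0.33 and would be classed as severe, while symmetrical responding gives a ratio of 1.0.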

Studies of recovery from neglect in rats have focused mainly on area AGm,
and the results indicate a number of strong similarities to recovery in humans.
In rats, recovery from AGm-induced neglect has been found to occur in three
contexts: (1) as found in humans, rodents with neglect demonstrate severe
long-term deficits with some limited spontaneous recovery which may occur over the
course of weeks to months [Corwin et al., 1986, King and Corwin, 1990, King
and Corwin, 1993], (2) as in humans, dopamine (DA) agonists produce acute
recovery of function [Corwin et al., 1986, Fleet et al., 1987, Geminiani et al.,
1998, Hurford et al., 1998, King and Corwin, 1990, Van Vleet et al., 2003b],
and (3) exposure to 48 hr of light deprivation (LD) at 4 hr postsurgery produces
dramatic immediate recovery [Burcham and Corwin, 1998, Crowne et al., 1983,
Van Vleet et al., 2003a].

2. Role of the Dorsocentral Striatum in Neglect
2.1. Behavioral Studies

While it was documented that recovery does occur in the three contexts
mentioned above, the crucial systems involved in recovery and the neural mechanism
remained unknown. However, studies in rodents and humans indicated that the
striatum may play a crucial role in neglect and recovery from neglect produced
by unilateral AGm lesions.

The striatum receives glutamatergic inputs from cortex and thalamus, and
dopaminergic inputs from substantia nigra (SN) that converge on the same
medium spiny neurons [Parent, 1990]. Neglect can be produced by cortical
lesions, by disruption of the nigrostriatal DA system [Marshall and Gotthelf,
1979], or by striatal infarcts [Caplan et al., 1990]. The importance of
the striatum in AGm lesion-induced neglect was originally suspected because
systemic delivery of apomorphine (a DA receptor agonist) can produce acute
recovery [Corwin et al., 1986, King and Corwin, 1990], and spiroperidol (a
DA receptor antagonist) can reinstate neglect [Vargo et al., 1989]. Based on
our findings and those of Marshall on subcortical neglect in rats, Fleet et al.
[Fleet et al., 1987], Geminiani et al. [Geminiani et al., 1998], and Hurford et
al. [Hurford et al., 1998] examined the effects of DA agonists (bromocriptine
or apomorphine) in human patients with chronic neglect and found that DA
agonists produced significant acute recovery. While DA agonist treatment was
effective, the site of action for the therapeutic effects was unknown.

Vargo and Marshall [1996a; 1996b] were the first to suggest that plastic
changes in the striatum were correlated with neglect and behavioral recovery
from AGm-induced neglect. They found that neglect was correlated with
decreases in NMDA and kainate receptors in the ipsilesional dorsolateral striatum,
and recovery was correlated with a normalization of kainate receptors and a 10%
increase in NMDA receptors in this region. The basis for the changes in
glutamate receptors was unknown, and could be based on changes in presynaptic or
postsynaptic receptors [Chen and Hillman, 1990].

Based on these findings we began a series of behavioral, pharmacological,
and anatomical studies to examine the role of the striatum in neglect and
behavioral recovery of function. Our anatomical findings [Reep and
Corwin, 1999] showed that the major site for the projections of the AGm was a
region in the dorsocentral striatum (DCS) (Figure 10.1); therefore, the initial
studies were designed to examine the potential role of this region.

In the initial study of the DCS we directly compared the behavioral effects
of unilateral lesions of the AGm or the DCS on the major manifestations of
the neglect syndrome. The results of this study indicated that axon-sparing
unilateral DCS lesions resulted in severe multimodal neglect (Figures 10.2 and
10.3) which did not spontaneously recover even after 96 days of testing, thus
supporting the contention that the DCS is a critical component of the network
for neglect and directed attention [Van Vleet et al., 2000]. However, unlike
AGm lesions, unilateral DCS lesions did not produce extinction or allesthesia
[Van Vleet et al., 2002] (Figure 10.4). The results were the first to indicate that
neglect and extinction are experimentally dissociable. Further, the results
indicated that extinction deficits may be based on a disruption of cortical systems
related to directed attention. It is of some interest that neither AGm- nor
DCS-induced neglect was correlated with increases in circling behavior, which
indicates that the deficits in orientation were not secondary to postural or motor
asymmetries. These results are of some import because they suggest that the
deficits associated with the neglect syndrome are multiple. However, to the
degree they are dissociable it may be possible to target prospective behavioral
or pharmacological treatments so as to deal with specific individual deficits
[Pyter et al., 2002]. These considerations are clinically relevant because
extinction deficits often persist in patients who have recovered spontaneously from
neglect [Kerkhoff, 2001].

Figure 10.1. In this coronal section through the forebrain, labeled axons from cortical area
AGm terminate most densely in the dorsocentral region of the striatum (arrow). There is also a
substantial projection to the margin of the striatum bordering the external capsule. Thick profiles
represent axon fascicles.

Figure 10.2. Axon-sparing lesions of the dorsocentral striatum, made using NMDA. Maximum
(outlined) and minimum (stippled) extents of the lesions are shown at two a-p levels for the DCS
and lateral striatal control (LSC) groups.



In subsequent studies the role of the DCS in acute drug-induced recovery
was examined. The results of these studies indicated that the likely site of
action for the therapeutic effects of apomorphine on neglect is the DCS. Van
Vleet et al. [Van Vleet et al., 2000] found that apomorphine was ineffective
in subjects with neglect induced by unilateral destruction of the DCS. These
findings led to the conclusion that the integrity of the DCS was necessary for
the therapeutic effects of apomorphine. In order to examine the role of the
DCS as the crucial site for the therapeutic effects of apomorphine, we infused
apomorphine directly into the DCS or a laterally adjacent area not implicated in
the circuitry related to neglect. Apomorphine infusion into the DCS produced
dramatic dose-dependent recovery from multimodal neglect [Van Vleet et al.,
2003b]. Infusion into a laterally adjacent area in the striatum did not produce
a therapeutic effect, even when much higher dosages were used (Figure 10.3).
The drug-induced recovery was virtually identical to that found in prior studies
which used systemic administration of apomorphine [Corwin et al., 1986, King
and Corwin, 1990]. We have also examined the effects of apomorphine on
extinction [Pyter et al., 2002]. We have found that, in keeping with previous
studies in rats [Corwin et al., 1986, King and Corwin, 1990] and humans [Fleet
et al., 1987, Geminiani et al., 1998, Hurford et al., 1998], dopamine agonists
produce a therapeutic effect on severe multimodal neglect. However, apomorphine
does not produce a therapeutic effect on extinction in these same subjects.

Figure 10.3. Neglect ratios for the DCS-lesioned group, vehicle control DCS group, and LSC
lesion control group. A neglect ratio of 1.0 represents normal, symmetrical responding to stimuli
presented on the left or right side; ratios < 0.4 represent severe neglect. Data are presented for
each modality separately as well as in a total neglect ratio that combines the data from all
modalities. Stars represent significant differences on a weekly basis. The DCS-lesioned group
had a significantly (P < 0.05) lower total neglect ratio than the other two groups when data
were collapsed across weeks.



Recovery from AGm-induced neglect can be produced by exposing the AGm
operates to 48 hr of light deprivation within 4 hr of the lesion. The results of
a recent study from our group have indicated that the DCS is the likely site of
action for the therapeutic effects of light deprivation [Van Vleet et al., 2003a].
In that study, subjects with combined unilateral AGm and DCS lesions did
not demonstrate light deprivation-induced recovery, while subjects with AGm
lesions combined with a lesion of the striatum laterally adjacent to the DCS did
demonstrate dramatic recovery. In this study we also found that the therapeutic
effects of light deprivation were limited to neglect; a severe extinction deficit
was unaffected by light deprivation. These results further support the contention
that the neural substrates for neglect and extinction are dissociable and that the
substrates which underlie recovery are likely to differ as well.

The results of the behavioral studies provide strong evidence that the DCS is
a crucial component of the circuitry related to directed attention and neglect.
Further, the integrity of the DCS is crucial for recovery from neglect in all three
of the contexts in which behavioral recovery has been obtained in the rodent
model.

Figure 10.4. DCS-lesioned rats exhibit no extinction, in contrast to AGm-lesioned rats. Shown
are mean frequencies of contacting the contralateral tab first in the AGm and DCS groups.
Maximum number of potential removals is five. Stars indicate significant difference (P < 0.05)
from pretest.

The dissociation between neglect and extinction in these studies is consistent
with recent findings in the literature on human neglect patients [Stone et al.,
1998, Vallar et al., 1994]. The results strongly suggest that different
pharmacological and environmental interventions may be required to promote recovery
for the constellation of deficits that comprise the neglect syndrome. These
findings are particularly important because in many "recovered" neglect patients
extinction deficits remain [Kerkhoff, 2001]. These results have clear clinical
implications, and they indicate that adequate treatment of patients with neglect
and extinction may be pharmacologically complex, and that specific
components of the "neglect syndrome" may have to be specifically addressed. This
very issue was one of the highlights of a recent review of the literature on
neglect [Kerkhoff, 2001].

2.2. Anatomical Studies 

The anatomical connectivity of DCS has been examined in several studies
by our group, beginning with the finding that DCS receives a prominent input
from cortical area AGm (Figure 10.5). The anatomical studies have provided a
foundation for quantitative assessment of dynamic changes in the DCS that are
correlated with behavioral recovery.

Cortical projections to the DCS were investigated by injection of retrograde
fluorescent axonal tracers into the DCS [Cheatwood et al., 2003]. A key
finding of this study was that in addition to its main input from cortical area AGm,
DCS also receives substantial input from the multimodal posterior parietal
cortex (PPC). This is significant because PPC and AGm are linked by
corticocortical connections [Vandevelde et al., 1996] and are both critical components
of the circuitry involved in spatial processing and directed attention. Other
cortical areas providing input to DCS include visual association area Oc2M,
lateral agranular cortex (AGl), and orbital cortex (VLO, LO). These areas have
reciprocal connections with AGm and PPC. Inconsistent labeling was seen in
somatic sensorimotor areas FL, HL and Par1.

Thalamic afferents to DCS were found to be prominent from LD, LP, MD,
VL and the intralaminar nuclei. Collectively, these nuclei constitute the sources
of thalamic input to cortical areas AGm and PPC. We found evidence of
topography in the thalamic projections to DCS from LD and LP, and the observed
pattern is consistent with the known organization of thalamocortical
projections involving these nuclei and the topography of corticostriatal projections
from their cortical targets. Injections in dorsal DCS consistently labeled
thalamic nuclei LD and LP. Together with previous data on thalamocortical and
corticocortical connections, this finding suggests that dorsal DCS is
preferentially related to caudal AGm.

In a subsequent study [Reep et al., 2003] we utilized the anterograde
axonal tracer BDA to delineate the pattern of corticostriatal terminations in DCS
originating from areas AGm, PPC, Oc2M, AGl, and the orbital cortex. These
findings revealed that the projection from AGm is prominent within DCS and the
main corticostriatal projections from areas other than AGm are situated around
the periphery of DCS in a roughly circular fashion: visual association cortex
dorsomedially, PPC dorsally, AGl laterally, and orbital cortex ventrally
(Figure 10.6). Each of these cortical projections is also represented by less dense
aggregates of terminal labeling within DCS. Double anterograde labeling
using fluorescent dextran tracers provided direct confirmation that corticostriatal
axons from AGm and PPC (and from AGm and Oc2M) overlap and
interdigitate at foci within DCS, often forming terminal fields within close proximity
to one another. This relationship suggests that axons from AGm and PPC may
be making synaptic connections on the same individual striatal neurons. We
are now investigating this possibility using anterograde tracing coupled with
electron microscopy.

An important discovery from the anatomical experiments was that DCS
receives input from several cortical areas which are themselves interconnected.
This indicates a selective convergence within DCS of projections from
interconnected cortical areas, and represents support for the contention that DCS
is a core associative region of the dorsal striatum. An especially interesting
example is that projections from AGm and PPC converge in DCS. These two
cortical areas are linked by extensive corticocortical connections [Corwin and
Reep, 1998]. When these axons are selectively severed without directly
damaging PPC or AGm, neglect occurs [Burcham et al., 1997]. Taken together
with the behavioral and pharmacological data above, it appears that correlated
convergent input to DCS from AGm and PPC may be necessary for normal
directed attention and neglect.

[Figure: neglect ratios across the Pretest, Apo test, and Post test phases.]

Figure 10.5. Neglect ratios for four groups with unilateral lesions of AGm, given different
dosages of apomorphine or vehicle infused into DCS. A ratio of 0.0 represents symmetrical
responding, and all groups exhibited severe neglect in the pretest phase. The star indicates that
the high dosage group exhibited significantly more symmetrical responding than all other groups
except the medium dosage group.

Figure 10.6. Schematic summary of the topography of corticostriatal projections in the vicinity
of DCS. Each ellipse represents the general location of the main projection field from the indicated
cortical area. DCS constitutes a central core whose perimeter is outlined by the dense projections
from several cortical areas.

The results of the anatomical studies to date indicate that DCS receives inputs 
from cortical and thalamic areas that are themselves linked by corticocortical 
and thalamocortical connections (Figure 10.7). Together, the findings of the 
behavioral and anatomical studies support the view that DCS is an integral 
component of an associative network of cortical, striatal, and thalamic regions 
involved in directed attention, is the site of multimodal integration of spatial 
stimuli, and is the critical substrate for recovery from neglect produced by 
cortical lesions. 

3. Role of induced plasticity in DCS in Recovery from 
AGm-induced Neglect 

In a number of recent studies the search for factors to promote plasticity
and recovery of function from damage to the CNS has focused on the role of
extrinsic factors in the neural environment that might be inhibitory for neurite
outgrowth and plasticity [Brittis and Flannagan, 2001, Horner and Gage, 2000].
The results of these studies have indicated that plasticity can be induced
following CNS injury by inhibition of the glial inhibitory molecule Nogo through
treatment with anti-Nogo antibodies such as IN-1 [Bandtlow and Schwab,
2000, Chen et al., 2000], and have identified receptors that can mediate
Nogo activity [Fournier et al., 2001, Liu et al., 2002]. IN-1 treatment has been
found to produce recovery and plasticity in several contexts including
regeneration of corticospinal tracts in cats [Schnell and Schwab, 1990, Schnell and
Schwab, 1993, Schnell et al., 1994], functional recovery of locomotor functions
[Bregman et al., 1995], and recovery in a forelimb reaching task [Papadopoulos
et al., 2002].

Our behavioral findings indicate that the striatal projection zone of the AGm
in the dorsocentral striatum (DCS) is crucial for recovery of function from
severe neglect following unilateral destruction of the AGm. It remains to be
determined whether dynamic changes in the DCS are systematically related to
recovery from AGm-induced neglect. In order to examine this issue we have
embarked on a series of studies designed to determine whether IN-1 will induce
plasticity in the DCS which would lead to subsequent behavioral recovery.

Table 10.1. BDA-labeled axon density in DCS at the septal level (~ +0.7 ap) resulting from
injections of BDA in cortical area AGm on the right side.

                      Normal AGm       Control hybridoma      IN-1 hybridoma
                                       (left AGm lesioned)    (left AGm lesioned)
Case numbers          108, 109, 112    H300, 355, 364         H302, 346, 347
Contra/Ipsi axon      0.33-0.42        0.35-0.47              0.85-1.31
density ratio



In our initial study, we examined whether administration of IN-1 would
induce plasticity in the DCS following unilateral lesion of cortical area AGm.
Given our prior findings concerning the crucial role of the DCS in recovery
from neglect, we felt that such plasticity would have positive implications for
recovery from neglect and the potential role of the input from contralesional
AGm and ipsilesional PPC in recovery. Therefore, we examined whether IN-1
can induce neural growth in the afferents to the DCS from the contralesional
AGm and the ipsilesional PPC, following unilateral lesion of AGm, and whether
this plasticity is systematically related to recovery from severe AGm-induced
neglect.

Prior studies of corticostriatal plasticity have focused on the input from the
contralesional cortex [Carmichael and Chesselet, 2002, Kartje et al., 1999,
Papadopoulos et al., 2002]. The monoclonal antibody IN-1 successfully induces
sprouting of these projections in the form of increased axon density, and this
plasticity is associated with functional recovery [Kartje et al., 1999,
Papadopoulos et al., 2002]. However, sprouting may also be considered to include growth
cone formation and synaptogenesis. For the AGm, the ipsilateral projection
to the DCS in normal animals is approximately three times denser than the
contralateral (Table 10.1). When the left AGm is lesioned to produce
neglect, significant denervation of left DCS occurs. Does the input from right
AGm to left DCS then sprout to occupy this vacated synaptic space (Figure
10.8)? If so, there should be an associated increase in axon density from the
contralateral AGm. This change would be reflected as an increase in the
contralateral/ipsilateral (left/right) ratio of axon density in DCS originating from
the right AGm. This can be assessed by depositing an anterograde axonal tracer
in right AGm and measuring axon density in left and right DCS [Papadopoulos
et al., 2002].
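The arithmetic behind the contralateral/ipsilateral ratio can be sketched as follows. The function names are hypothetical, and the calculation assumes that the ipsilateral density stays constant, so that a change in the ratio reflects a change in contralateral density alone. The example values are the range endpoints listed in Table 10.1, used here purely for illustration rather than as a reanalysis of the data.

```python
def contra_ipsi_ratio(contra_density, ipsi_density):
    """Contralateral/ipsilateral axon density ratio (left/right DCS here)."""
    if ipsi_density <= 0:
        raise ValueError("ipsilateral density must be positive")
    return contra_density / ipsi_density

def percent_increase(old, new):
    """Relative change from `old` to `new`, in percent."""
    if old <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (new - old) / old

# An ipsilateral projection three times denser than the contralateral one
# corresponds to a ratio of about 0.33 (arbitrary density units).
normal_ratio = contra_ipsi_ratio(1.0, 3.0)

# Range endpoints from Table 10.1: highest normal ratio, lowest IN-1 ratio.
# With ipsilateral density held fixed (assumption), the ratio change equals
# the change in contralateral density.
smallest_gain = percent_increase(0.42, 0.85)
```

Even this most conservative pairing of endpoints corresponds to a gain of roughly 102%, provided the ipsilateral density is unchanged.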

3.1. Sprouting from contralesional AGm 

Figure 10.7. Major pattern of connections involving DCS. Thalamic regions projecting to DCS 
also project to cortical areas AGm and PPC, which converge in DCS. 
[Figure labels: MD, VM; LD, LP] 

Figure 10.8. Schematic representation of the normal corticostriatal projections from AGm to 
dorsocentral striatum, and the hypothesized sprouting that occurs following unilateral lesion of 
AGm and IN-1 treatment. 
[Panels: "Normal projection from AGm to dorsocentral striatum" and "Sprouting from 
contralesional AGm to ipsilesional DCS?"; each panel shows left AGm and right AGm.] 

A series of subjects (N=9) received lesions of left AGm which were accompanied 
by injection of 5 µl of suspended hybridoma cells producing either the 
IN-1 antibody or a control antibody (HRP). The subjects were tested for neglect 
for 4-7 weeks, and we then injected BDA in the right (contralesional) AGm and 
assessed sprouting by measuring axon density in DCS. In three normal control 
animals with injections of BDA in rostral, mid or caudal AGm, the axon den- 
sity contralateral to the injection was less than half the density ipsilateral to the 
injection, as reflected in contralateral/ipsilateral ratios less than 0.5 (Table 1; 
Figure 10.9). Control hybridoma cases exhibited similar ratios. However, in 
IN-1 experimental cases that had recovered from severe neglect, the contralat- 
eral density increased by over 100%, resulting in densities equivalent to those 
ipsilaterally, and thus ratios of 1.0 (Table 1; see Figure 10.2). This preliminary 
evidence indicates that subjects that were given IN-1 demonstrated behavioral 
recovery and axonal sprouting in the left DCS from axons originating in con- 
tralesional AGm. Thus, the data suggest that IN-1 induces axonal plasticity 
in treated subjects, and that this plasticity may be the cause of the recovery 
from severe neglect. We found that the changes in density were not based on 
shrinkage of the DCS subsequent to the AGm lesion, which could have resulted 
in an apparent increase in density. Neither the DCS nor the striatum as a whole 
exhibited any shrinkage. These findings support those of Papadopoulos et al. 
[Papadopoulos et al., 2002] and Kartje et al. [Kartje et al., 1999] using IN-1, 
and Carmichael and Chesselet [Carmichael and Chesselet, 2002] using 
thermocoagulatory lesions, which demonstrate corticostriatal plasticity from the 
contralesional motor cortex. Our results suggest that those findings extend to 
corticostriatal inputs from association cortex, and that plasticity following IN-1 
treatment may produce recovery from severe neglect. Moreover, the plasticity 
appears to be specific to areas that project to the DCS. In one case, after a lesion 
of left AGm, BDA was injected into the right anterior cingulate cortex instead 
of right AGm. This case exhibited no increase in axon density in left DCS 
compared to a control (contra/ipsi ratios = 0.30 and 0.35), suggesting that not 
all cortical areas projecting to DCS are competent to sprout into it. 

In previous studies that have examined sprouting and its relationship to motor 
recovery, typically only contralesional inputs have been examined [Kartje et al., 
1999, Papadopoulos et al., 2002]. In the case of neglect we have demonstrated 
that the AGm and the ipsilesional PPC function as a system [Burcham et al., 
1998]. Destruction of either area alone or transection of connections between 
the two produce neglect. Given these relationships, we examined the potential 
for sprouting of the PPC projections to the DCS in two subjects. In an exper- 
iment similar to those described above, BDA was injected into the left PPC 
following lesion of the left AGm and subsequent treatment with IN-1. Nor- 
mally, the projection from PPC to ipsilateral DCS is less than twice the density 
of the contralateral projection (ipsi/contra ratio = 1.78). However, after IN-1 
treatment there was a large (60%) increase in this ratio, to 2.86. The change in 
density was not explainable by shrinkage of the DCS, which could lead to an 
apparent change in density. These preliminary findings strongly suggest 
that sprouting occurred in the PPC projections into the DCS. 

Figure 10.9. All images depict BDA-labeled axons in the dorsocentral striatum (DCS). For 
each brain (cases 108 and H302) four panels are shown, depicting ipsilateral and contralateral 
axonal labeling after an injection of BDA into cortical area AGm. Fascicles are denoted by "f". 
For each brain the top row represents normalized, contrast-enhanced images, and the bottom 
row represents extracted (high pass filtered) versions of the same images. The extracted images 
were used for density measurements. In normal rats like case 108 there is denser axonal labeling 
in the ipsilateral DCS than contralaterally. In experimental cases like H302 there was a shift so 
that the contralateral side exhibited axon density equivalent to that on the ipsilateral side. 
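
The percentage changes quoted for the density ratios can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, the PPC ratios (1.78 and 2.86) are the values given in the text, and the AGm starting ratio of 0.5 is an assumed upper bound taken from the statement that normal contralateral/ipsilateral ratios were below 0.5.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before

# PPC projection: ipsi/contra ratio reported as 1.78 normally and 2.86 after IN-1.
ppc_change = percent_change(1.78, 2.86)

# AGm projection: a contra/ipsi ratio rising from 0.5 (assumed upper bound for
# normal animals) to 1.0 means the contralateral density doubled relative to
# the ipsilateral density.
agm_change = percent_change(0.5, 1.0)

print(f"PPC ipsi/contra ratio change: {ppc_change:.1f}%")
print(f"AGm contra/ipsi ratio change: {agm_change:.1f}%")
```

The PPC figure comes out at roughly 61%, in line with the "large (60%) increase" reported, and a ratio moving from 0.5 to 1.0 corresponds to a 100% increase.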

The results of these preliminary studies strongly suggest that robust sprouting 
effects can be produced in this system in the presence of the IN-1 antibody. 
Further, these results suggest that cortical inputs to DCS other than the crossed 
input from AGm may also exhibit sprouting. A related issue which may speak 
to the potential clinical efficacy of IN-1 concerns the specificity of the plasticity. 
We have found plasticity in the ipsi- and contralesional inputs to the DCS. Is 
the plasticity specific to DCS or is it also found in other regions of the dorsal 
striatum? Perhaps IN-1 treatment produces uncontrolled nonspecific plasticity 
in the striatum. We have preliminary data from one subject that 
received an AGm lesion and was treated with IN-1. After behavioral recovery, 
BDA was injected into the contralesional cingulate cortex which projects to an 
adjacent region of striatum, but not to the DCS. The results indicated that the 
ratio of ipsi/contralateral inputs from the cingulate cortex into the striatum is 
unchanged despite the presence of IN-1. Also, as mentioned above, there was 
no evidence of sprouting from the cingulate cortex into the DCS even though 
the main striatal projection zone of the cingulate cortex is medially adjacent to 
the DCS. 

4. Clinical Significance 

Neglect is a severe and prevalent clinical disorder, and at present there are 
no generally accepted therapies for the treatment of neglect. We have devel- 
oped a rodent model to study neglect and behavioral recovery, and these rats 
exhibit many of the same fundamental types of deficits found in human pa- 
tients with neglect. The evidence discussed above, including our preliminary 
findings, strongly suggests that the DCS plays a crucial role in neglect induced 
by cortical lesions, and that DCS may be the crucial site for the mechanisms 
leading to recovery. Our anatomical findings indicate that the DCS receives 
converging inputs from the AGm and the PPC, and that the DCS might 
correctly be considered to be associative striatum [Cheatwood et al., 2003, Reep 
et al., 2003]. These anatomical findings have been the foundation for our most 
dramatic recent finding that extensive plasticity can be induced in the DCS 
afferents from the contralesional AGm and the ipsilesional PPC using IN-1, 
and that recovery from neglect is correlated with this plasticity. These findings 
suggest that induced axonal sprouting and synaptogenesis in the DCS may 
underlie recovery. Prior studies using IN-1 have focused on recovery from motor 
deficits [Papadopoulos et al., 2002]. In the present proposed studies we extend 
the potential use of growth-inducing factors into the treatment of neglect, a 
cognitive deficit induced by brain damage. What is most exciting, and the essence 
of these experiments, is that if plasticity can be induced there might eventually 
be a significant lasting treatment or a cure for this cognitive disorder in humans. 

Introduction 

Visual evoked potentials (VEPs) to checkerboard stimuli are often recorded 
to evaluate the functional visual nervous system from the eye to the brain. For 
example, various types of VEPs have been used for the early detection of multiple 
sclerosis (MS) [Harter, 1970, Chiappa, 1983, Andersson and Siden, 1991, Towle 
et al., 1991, Hood and Zhang, 2000]. MS is a central nervous system (CNS) 
disease, characterized by multiple areas of demyelination [Robinson and Rudge, 
1977, Waxman, 1983]. People suffering from MS often develop optic neuritis 
(ON) as well. 

We recorded multifocal VEPs (mVEPs) from Normal and MS subjects 
using multifocal stimuli with four degrees of temporal sparseness: Binary, 
Sparse4, Sparse16 and Pattern Pulse [James, 2003]. We examined response 
waveforms and latencies for these stimulus types to find 
out which one produced significant responses and the largest delays in MS 
subjects. 

1. Methods 
1.1. Stimuli 



The description of VEP stimuli is given in our earlier study [Maddess et al., 
2003]. Subjects viewed the monitor from a distance of 30 cm providing the 
stimulus layout as illustrated in Figure 11.1. The monitor was divided into eight 
regions, and a red fixation spot was presented at the screen’s centre. 



Figure 11.1. Sample of the visual stimulus. In practice the checks had contrasts ±1 or 0 (i.e. 
black, white or grey). The numbers (1-8) indicate the eight different regions. 



Examples of the four types of temporal sequences are presented in 
Figure 11.2. 

1.2. Subjects 

The MS study group contained 50 subjects (eight men and 42 women; age 
range 25 to 64 years, 45 ± 15.2 years). Twenty-six subjects suffered from ON. The Normal 
study group contained 19 subjects (12 men and seven women; age range 22 
to 44 years, 31.2 ± 13.1 years), with normal or corrected-to-normal refraction. The 
summarized subject data are presented in Table 11.1. 

The research followed the tenets of the Declaration of Helsinki and was 
conducted under protocol M9901 of the Australian National University’s Human 
Experimentation Ethics Committee. Informed written consent was obtained from the 
subjects after the nature and possible consequences of the study were explained to 
them. 



Figure 11.2. Example of the temporal modulation of a single region. a) Temporal stimulus 
density for the first (Binary) sparseness (probability of 1/2 that the check contrast takes the value -1 
or 1). b) Sparse4 stimulus, where the probability that a checkerboard appears in one sign or the 
other is \ ; the check contrast takes the values {-1, 0, 1}. c) Sparse16, where the probability 
that a checkerboard appears in one sign or the other is ^ . The stimulus is ternary, the check 
contrast taking the values {-1, 0, 1}. d) Pattern Pulse, where the probability that a checkerboard 
appears in one sign or the other is ^ . The stimulus is ternary, the check contrast taking the values 
{-1, 0, 1}. The duration of one video frame (Time axis) is 19.7 ms/eye. 



Table 11.1. Summarized subject data. The two columns at the left show the two study groups 
(MS and Normals) and the number of subjects. VEPs were recorded for 8 MS and 12 Normal 
men (see Sex). Duration of MS indicates the age of the disease. During this time patients had 
approximately 9.46 clinical attacks (see N of attacks). 26 subjects had optic neuritis (ON 
column) and for 16 of them the CSF test was positive (CSF). MS type was Relapsing Remitting 
(RR). 

2. Data Analysis 

Within each response we analyzed positive and negative peaks within 
two time periods in Normal and MS responses. For the Normal study group, the 
first two peaks of all the responses fell in the [59.4 to 99 ms] and [100 to 158 
ms] time periods respectively. Since these two time windows contained the 
first negativity (N1) and the first positivity (P1), the peaks and their temporal 
windows are called the N1 and P1 peaks and windows [Maddess et al., 2003]. 

The first MS response negativities N1_MS and positivities P1_MS were located 
over the whole length of the response [40 to 300 ms]. To find them we used a 
criterion crit equal to 1.45 multiplied by the standard error (SE) of the data. This 
criterion allowed us to estimate the best time peaks at a small error rate. 
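
One way such a threshold criterion could be applied is sketched below. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name, the use of a single scalar SE, the local-extremum test, and the synthetic waveform are all hypothetical. Only the multiplier 1.45 and the 40 to 300 ms search window come from the text.

```python
import numpy as np

CRIT_MULTIPLIER = 1.45  # multiplier quoted in the text

def first_peak(times_ms, response, se, window=(40.0, 300.0), polarity=-1):
    """Return (time, amplitude) of the first local extremum of the given
    polarity inside `window` whose magnitude exceeds 1.45 * SE, or None."""
    times_ms = np.asarray(times_ms, dtype=float)
    signed = polarity * np.asarray(response, dtype=float)
    threshold = CRIT_MULTIPLIER * se
    for i in range(1, len(signed) - 1):
        in_window = window[0] <= times_ms[i] <= window[1]
        is_local_max = signed[i] > signed[i - 1] and signed[i] >= signed[i + 1]
        if in_window and is_local_max and signed[i] > threshold:
            return times_ms[i], polarity * signed[i]
    return None

# Synthetic response sampled once per 9.9 ms frame: a negative deflection
# near 90 ms followed by a positive deflection near 140 ms.
t = np.arange(0.0, 400.0, 9.9)
wave = (-1.0 * np.exp(-((t - 90.0) / 15.0) ** 2)
        + 0.8 * np.exp(-((t - 140.0) / 20.0) ** 2))

n1 = first_peak(t, wave, se=0.1, polarity=-1)   # first negativity
p1 = first_peak(t, wave, se=0.1, polarity=+1)   # first positivity
print("N1 (ms, amplitude):", n1)
print("P1 (ms, amplitude):", p1)
```

Raising the multiplier makes the search more conservative: fewer noise fluctuations are accepted as peaks, at the cost of missing small genuine responses.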

2.1. Multivariate regression analysis 

Our goal was to find out whether the MS waveforms could be decomposed 
into several components, and whether those components were delayed and by how 
much. We used a multivariate regression model to find such a decomposition. The 
coefficients obtained from the multivariate regression described the fit between 
individual MS subject data and averaged Normal data sets. The best-fitting 
coefficients indicated the possible delay of the MS data relative to the averaged 
Normal data. 
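
A regression of this kind can be sketched as a least-squares fit of an MS response onto delayed copies of the averaged Normal response. The formulation below is an assumption made for illustration (the chapter does not give its design matrix); the function names, the maximum delay, and the synthetic waveforms are hypothetical.

```python
import numpy as np

def delayed_copies(template, max_delay):
    """Design matrix whose column k is `template` delayed by k samples (zero padded)."""
    n = len(template)
    X = np.zeros((n, max_delay + 1))
    for k in range(max_delay + 1):
        X[k:, k] = template[: n - k]
    return X

def fit_delays(ms_response, normal_average, max_delay=5):
    """Least-squares coefficients of the MS response on delayed Normal templates."""
    X = delayed_copies(np.asarray(normal_average, dtype=float), max_delay)
    coef, *_ = np.linalg.lstsq(X, np.asarray(ms_response, dtype=float), rcond=None)
    return coef

# Synthetic check: an "MS" response equal to the Normal template, halved in
# amplitude and delayed by 3 samples (3 video frames if sampled once per frame).
frames = np.arange(40)
normal = (np.exp(-((frames - 10) / 3.0) ** 2)
          - 0.6 * np.exp(-((frames - 16) / 4.0) ** 2))
ms = 0.5 * np.roll(normal, 3)
ms[:3] = 0.0  # discard the samples that np.roll wrapped around

coef = fit_delays(ms, normal)
best_delay = int(np.argmax(np.abs(coef)))
print("coefficients:", np.round(coef, 3))
print("best-fitting delay (frames):", best_delay)
```

Delayed copies of a smooth waveform are strongly collinear, so with noisy recordings a joint fit of many closely spaced delays is poorly conditioned. Restricting the candidate delays, or keeping only coefficients that pass a significance test, is one way to keep the estimates stable.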

2.2. Discriminant data analysis 

The objective of this analysis was to determine whether the structure of the 
data permitted a method that was able to discriminate normal subjects from 
those who suffered from MS. We employed two types of discriminant analysis: 
Linear (LDA) and Quadratic discriminant analysis (QDA) [Johnson and 
Wichern, 1992]. 
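
The difference between the two methods can be sketched in a few lines: both model each group as a multivariate Gaussian, but LDA uses one pooled covariance matrix (giving linear decision boundaries) while QDA uses a separate covariance matrix per group (giving quadratic boundaries). The code below is a textbook-style sketch on synthetic two-feature data, not the analysis pipeline of the study; all names and data are hypothetical.

```python
import numpy as np

def fit_gaussian_classes(X, y):
    """Per-class mean, covariance and prior, plus the pooled covariance."""
    classes = np.unique(y)
    stats = {}
    pooled = np.zeros((X.shape[1], X.shape[1]))
    for c in classes:
        Xc = X[y == c]
        cov = np.cov(Xc, rowvar=False)
        stats[c] = (Xc.mean(axis=0), cov, len(Xc) / len(X))
        pooled += (len(Xc) - 1) * cov
    pooled /= len(X) - len(classes)
    return stats, pooled

def predict(X, stats, pooled, quadratic=False):
    """Assign each row of X to the class with the largest discriminant score.
    LDA uses the pooled covariance; QDA uses each class's own covariance."""
    labels = list(stats)
    scores = []
    for c in labels:
        mean, cov, prior = stats[c]
        S = cov if quadratic else pooled
        d = X - mean
        maha = np.einsum("ij,jk,ik->i", d, np.linalg.inv(S), d)
        score = -0.5 * maha + np.log(prior)
        if quadratic:
            score -= 0.5 * np.linalg.slogdet(S)[1]
        scores.append(score)
    return np.array(labels)[np.argmax(scores, axis=0)]

# Synthetic two-feature data (e.g. an amplitude and a time to peak).
rng = np.random.default_rng(1)
normal = rng.normal([0.0, 0.0], 1.0, size=(40, 2))
ms = rng.normal([6.0, 6.0], [1.0, 2.0], size=(40, 2))
X = np.vstack([normal, ms])
y = np.array([0] * 40 + [1] * 40)

stats, pooled = fit_gaussian_classes(X, y)
lda_acc = np.mean(predict(X, stats, pooled) == y)
qda_acc = np.mean(predict(X, stats, pooled, quadratic=True) == y)
print(f"LDA accuracy: {lda_acc:.2f}, QDA accuracy: {qda_acc:.2f}")
```

Accuracies computed on the same data used for fitting are optimistic. With small groups, a leave-one-out or other cross-validated estimate gives a fairer picture, particularly for QDA, which estimates more parameters.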

3. Results 

3.1. General findings 

We obtained four repeats of the four response data sets for each of the 
16 stimulus regions (eight per eye). Sample results are shown in 
Figure 11.3. 

The left panel of Figure 11.3 shows typical Normal Pattern Pulse 
responses, and the right panel typical ON Pattern Pulse responses. 

We observed a significant difference between Normal and MS multifocal 
responses. To examine the difference between Normal and MS data we again 
applied multiple linear regression analysis. Table 11.2 summarizes the regression 
results. 

Figure 11.3. Typical Normal and MS responses. a) The two panels on the left (Pattern Pulse 
(Norm)) represent the multifocal responses for the left (OS) and for the right (OD) eye. b) The 
two right panels (Pattern Pulse (MS)) show the typical Pattern Pulse MS responses for the left 
(OS) and for the right (OD) eye, for the eight visual field regions (see Fig. 11.1). The eight 
rows of responses in each panel represent the eight visual regions. Responses are shown in voltages 
(vertical axis). The horizontal axis indicates latencies in ms. 



Table 11.2. Summarized multivariate linear regression results for MS vs Normals data. The 
fitted N1 peaks for SNR data are presented in dB. The column labelled Condition indicates 
fitted values. The reference condition was MS Binary. The Multiplier column represents the 
corresponding condition multiplicative factor. The conditions Sparse4 * MS, Sparse16 * MS and 
Pattern Pulse * MS indicate interactions between Sparse4, Sparse16, Pattern Pulse and MS vs. 
Normals SNR N1 respectively. The coefficient for the Sparse16 stimulus is largest. According to 
these data, MS responses are 10.78 dB smaller than Normal for the Sparse16 stimulus. 

The data used in the present analysis were N1 absolute values of signal 
to noise ratios (SNR). The responses were converted to dB before fitting the 
model. Fitting simultaneously the Normal and MS study groups, N1 interactions 
between each sparseness, and a Superior visual field effect provided the most 
parsimonious model. For the maxima the variance accounted for was 0.541. 
The reference condition (-26.01 dB) corresponds to Binary Normal data. The 
coefficients for Sparse4 MS (0.92 dB), Sparse16 MS (-10.78 dB) and Pattern 
Pulse MS (-5.68 dB) correspond to changes of the responses for the MS subject 
group (compared with Normal data sets) from the Binary figure by factors of 1.12, 0.28 and 
1.92 respectively. We found that the MS data were significantly smaller 
than the Normal responses. 
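
The relation between the dB coefficients and the multiplicative factors can be checked numerically. The sketch below assumes the amplitude convention dB = 20 log10(ratio), which is not stated in the text; the function and dictionary names are hypothetical, and the dB values are those quoted above.

```python
def db_to_ratio(db: float) -> float:
    """Amplitude ratio for a dB value, assuming dB = 20 * log10(ratio)."""
    return 10.0 ** (db / 20.0)

# Coefficients quoted in the text (dB).
coefficients_db = {
    "Sparse4 MS": 0.92,
    "Sparse16 MS": -10.78,
    "Pattern Pulse MS": -5.68,
}

ratios = {name: db_to_ratio(db) for name, db in coefficients_db.items()}
for name, ratio in ratios.items():
    print(f"{name}: multiplier {ratio:.2f} (reciprocal {1.0 / ratio:.2f})")
```

Under this assumed convention the multipliers are about 1.11 for Sparse4, 0.29 for Sparse16 and 0.52 for Pattern Pulse. The first two are close to the quoted 1.12 and 0.28, while the quoted 1.92 matches the reciprocal of the Pattern Pulse multiplier, that is, a 1.92-fold reduction.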

3.2. Delays 

MS patients tended to have delayed VEPs. Previous investigations showed 
that VEP latencies were especially prolonged for ON subjects [Hood et al., 
2000b, Hood et al., 2000a]. To find out whether our MS data had 
prolonged latencies or consisted of several waveforms, we applied the multivariate 
regression method described in the Methods. The significance of a delay 
was determined from the regression coefficient and its p-level (p < 0.05). 

The best-fitted delays T1 were estimated for all MS subjects. Figure 11.4 
illustrates the regression results for the 26 ON subjects only. The figure shows 
the averaged delays (in video frames of ~9.9 ms) for all stimulated regions and 
stimuli. All ON responses were delayed by at least one video frame. The 
largest delays (3 video frames) were obtained for Pattern Pulse visual stimuli. 

We also estimated the second fitted delay T2. We found that only 18 ON 
subjects had a significant (p < 0.05) second delay. Amongst those 18 subjects 
the second delay was found in regions 2, 4, 7 and 8 for the right eye. The 
averaged T2 values were ~3.6, 3.9, 3.0 and 4.2 video frames for regions 2, 4, 7 
and 8 respectively, across all levels of sparseness. 

At the next stage we compared the fitted delays with the real MS 
times to peak (the estimation of N1_MS, NT_MS, P1_MS and PT_MS is described 
in Methods). The NT_MS were considered. We estimated the longest delay T_F, 
which was picked from T1 or T2; i.e., if the data contained more than one 
fitted delay, the longest one was chosen. The artificial data sets NT_F were 
created for all MS subjects by adding together the longest fitted delays T_F and 
the Normal averaged first-negativity times to peak (NT). We expected to obtain 
approximate MS time-to-peak values, which could be compared with the 
real values NT_MS. The fitted NT_F were subtracted from the NT_MS data sets. The 
differences ranged between -0.5 and 4 video frames; however, they varied for 
different regions. 

Figure 11.4. The first fitted delay T1. The top panel (OS) represents the averaged T1 for the 
left eye. The panel (OD) shows T1 for the right eye. The delays are averaged across all ON 
subjects, and are shown for each temporal sparseness. The vertical axis shows the delay in 
video frames (1 video frame = 9.9 ms). The horizontal axis indicates regions 1 to 8 for each 
eye. 
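
The comparison between fitted and observed times to peak reduces to simple arithmetic, sketched below. The frame duration (9.9 ms) and the rule of taking the longest fitted delay come from the text; the function names and the numerical values for the example region are hypothetical and are not taken from the study's data.

```python
FRAME_MS = 9.9  # duration of one video frame, as stated in the text

def frames_to_ms(frames: float) -> float:
    """Convert a delay expressed in video frames to milliseconds."""
    return frames * FRAME_MS

def predicted_time_to_peak(normal_nt_frames: float, t1: float, t2=None) -> float:
    """NT_F = Normal time to peak + longest fitted delay (all in frames)."""
    t_f = t1 if t2 is None else max(t1, t2)
    return normal_nt_frames + t_f

# Hypothetical values for one region.
nt_normal = 8.0        # Normal averaged N1 time to peak, frames
nt_ms_observed = 12.5  # observed MS time to peak NT_MS, frames
nt_f = predicted_time_to_peak(nt_normal, t1=3.0, t2=3.6)
difference = nt_ms_observed - nt_f

print(f"NT_F = {nt_f:.1f} frames, NT_MS - NT_F = {difference:.1f} frames")
print(f"3 frames = {frames_to_ms(3):.1f} ms")
```

A delay of 3 video frames corresponds to about 29.7 ms, which is the order of magnitude of the largest first delays reported for the Pattern Pulse stimuli.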



3.3. Discriminant data analysis 

We next examined the specificity and sensitivity of ON VEP responses. 
This first involved constructing models on the measures obtained from each 
data group. We examined the effect of N1 and P1 (see Methods), their implicit 
times, as well as the artificially obtained delays. Sensitivity and specificity were 
examined for each eye and region separately and together. One-eye responses 
were of little use. 

The first model was constructed from the N1, P1 and relevant time-to-peak (NT 
and PT respectively) data sets. The LDA and QDA specificities and sensitivities 
were tested for each sparseness, eye and region. The best specificities and 
sensitivities obtained were [91.3% 89.47%] for LDA and [94.7% 94.7%] for 
QDA. 
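
Sensitivity and specificity follow directly from the counts of correctly and incorrectly classified subjects. The sketch below uses the group sizes given in the Methods (26 ON subjects, 19 Normal subjects) with a hypothetical classifier output; the function name and the misclassification pattern are assumptions for illustration.

```python
def sensitivity_specificity(truth, predicted, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# 26 ON subjects (label 1) and 19 Normal subjects (label 0).
truth = [1] * 26 + [0] * 19
# Hypothetical output: one ON subject and one Normal subject misclassified.
predicted = [1] * 25 + [0] + [0] * 18 + [1]

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity = {100 * sens:.1f}%, specificity = {100 * spec:.1f}%")
```

With 19 Normal subjects, one misclassification gives a specificity of 18/19, or 94.7%. With groups of this size each additional error changes the percentages by several points.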

We also tested ON delays obtained by using the multivariate regression 
(see Methods). We constructed artificial MS data sets by adding together 
averaged Normal data NT means and delayed components T1 or T2. A new 
discriminant model contained N1, NT, P1, PT values and one of those artificial 
data sets (NT1 or NT2). The latter was of little use in comparison 
with NT1. With NT1 the sensitivity and specificity increased to 100% for 
both LDA and QDA models. 

4. Discussion 

VEPs were assessed at four different levels of sparseness. To estimate possible 
delays in MS data we employed the multivariate regression model 
described in Methods. We found that ON multifocal VEPs were delayed 
by ~30 ms relative to Normal responses. The delays varied between eyes, 
because of differing levels of optic neuritis. Our results showed bigger 
delays for sparser stimuli (Fig. 11.4). The delay varied for the different visual 
regions as well. 

After applying the different discriminant models, we found high sensitivities 
and specificities for the model incorporating P1, N1, implicit times and 
the first fitted delay T1. The best specificities and sensitivities were obtained 
again for the sparser stimuli. We also found it was worth combining the data of 
two eyes. In this case the performances reached [94.4% 94.7%] for LDA and 
[100% 100%] for the QDA model. 

Even though the simpler linear discriminant model performed at 100%, the 
more complex quadratic discriminant models require further verification. The 
latency analysis showed that none of the regions had a very significant or 
different delay. Thus, the fact that some regions give better performance can be 
explained as a random factor, dependent on the nature of the data. 

Spatiotemporal Transitions in Temporal Lobe Epilepsy 

Abstract 

Epilepsy is a common neurological disorder characterized by recurrent seizures, 
most of which appear to occur spontaneously as a result of complex dynami- 
cal interactions among many regions of the brain. The most common type of 
epilepsy in adults is temporal lobe epilepsy. Employment of nonlinear dynamics 
techniques, based on the chaos theory, led us to the hypotheses that (1) seizures 
are a transition from spatiotemporal chaos to a more ordered ictal (symptomatic) 
state, (2) during the interictal (asymptomatic) state, the epileptogenic focus is dy- 
namically isolated from other areas of the cerebral hemispheres, (3) the seizure 
discharge can occur only after a preictal transition state during which the focus 
starts to interact with other cortical areas (preictal state), and (4) the seizure serves 
to reset the brain, reversing the pathological interaction among critical cortical 
sites, thus dynamically isolating the seizure onset zone. 

Through the analysis of long-term intracranial EEG recordings obtained in 
patients with medically intractable seizures, we observed: (1) drop in values of 
a dynamical measure (STLmax) during the seizure, indicating increased tempo- 
ral order of the EEG signal - hypothesis 1, (2) convergence of STLmax values 
among almost all electrode pairs, indicating increased spatial order - hypothesis 
1, (3) generally higher values of T-index (divergent values of STLmax) during the 
interictal state when the epileptogenic focus is compared to other sites - hypothesis 
2, (4) the average differences in STLmax values between the epileptogenic 
hippocampus and most other sites are typically reduced (low T-index values, 
“dynamical entrainment”) prior to each seizure - hypothesis 3, and (5) a divergence 
of STLmax (increase in the T-index) between the epileptogenic hippocampus 
and other cortical sites after each seizure - hypothesis 4. These observations sup- 
port our hypotheses regarding the preictal transitions in temporal lobe epilepsy. 
We anticipate that these observations will lead to a better understanding of the 
physiological processes involved in temporal lobe epilepsy. 

Keywords: Temporal Lobe Epilepsy, Short-Term Maximum Lyapunov exponents, T-Index, 
Entrainment Transition 



1. Introduction 

Epilepsy is one of the most common neurological disorders in man. Tem- 
poral lobe epilepsy is the most common type of epilepsy in adults [11]. In this 
disorder, seizures usually begin as paroxysmal electrical discharges in the hip- 
pocampus. The discharges often spread first to ipsilateral, then to contralateral 
cerebral cortex. These abnormal discharges result in a variety of intermittent 
clinical phenomena, including motor, sensory, affective, cognitive, autonomic 
and psychic symptomatology. This type of epilepsy often is resistant to medical 
therapy. In medically refractory cases, surgical excision of the seizure 
focus may be an effective means of seizure control. In some surgical candidates, 
electrographic recordings are obtained for diagnostic purposes from subdural 
electrodes placed over the frontal and temporal cortex and depth electrodes 
implanted in the hippocampi bilaterally [29]. Such recordings offer a unique 
opportunity for research into electrophysiological processes of epileptogenesis 
in man. 

In human epilepsy of mesial temporal origin, seizures beginning in the hip- 
pocampus are often propagated throughout the brain. The temporal cortex, 
limbic structures and orbitofrontal cortex appear to play a critical role in the 
onset and spread of these seizures [29]. Most medically intractable complex 
partial seizures originate from, or are elaborated in the hippocampus [8,35,36]. 
The cellular mechanisms underlying seizures of hippocampal origin in humans 
are not completely understood. However, it has been established that there 
are characteristic patterns of neuronal loss and dendritic damage associated 
with alterations in neurotransmitter receptor densities in the epileptogenic hip- 
pocampus [2, 3, 26, 28, 34]. It is likely that these structural changes disrupt the 
normal excitatory and inhibitory feedback circuits in the hippocampus, leading 
to disturbances in its dynamical behavior. A central feature of the epilepto- 
genic hippocampus is the tendency to make abrupt transitions to well organized 
oscillations, characteristic of a seizure. 

Successful surgical treatment of seizures of focal origin depends upon ac- 
curate presurgical localization of the epileptogenic focus. However, in some 
cases, accurate localization is not possible with noninvasive techniques. In 
such cases, electrographic recordings with surgically implanted subdural and/or 
depth electrodes are employed to obtain accurate localization. This procedure 
typically requires several days to weeks of hospital stay during which anticon- 
vulsant drugs are withdrawn in order to allow seizures to occur. Typically, it is 
necessary to record 3 or more representative seizures in order to identify and 
localize the seizure focus. This time consuming and expensive process could 
be shortened considerably if the epileptogenic focus could be identified through 
analysis of the interictal electrographic signal. 

The benefit of defining an epileptogenic focus through analysis of the inter- 
ictal electrographic signal is clear. For every recorded seizure, there are many 
hours of interictal signal. The use of the interictal signal to localize epilep- 
togenic foci requires identifying specific characteristics of the interictal signal 
generated by the epileptogenic focus. This objective has not been possible 
through visual inspection of the EEG. However, by detecting specific quantita- 
tive characteristics of the electrographic signal generated by the epileptogenic 
focus during the interictal state, it may be possible to identify and localize 
seizure foci. Our work to date strongly supports the view that the interictal 
signal generated by an epileptogenic focus has specific features that can be 
detected and quantified by analytic measures developed for the study of complex 
nonlinear systems. 

Over the past several years, our group has sought to identify characteristic 
dynamical features of the electrographic signals generated by the epileptogenic 
focus. The possibility that such dynamical features might exist was suggested 
by the fact that there are structural abnormalities in the epileptogenic 
hippocampus (e.g., neuronal loss, dendritic abnormalities, and axonal sprouting of dentate 
granule cells) [3, 7, 10, 12, 27, 28, 33, 34] and there are localized zones of hy- 
pometabolism detectable with PET scans which are present during the interictal 
state [1,25,32]. Using techniques developed to identify nonlinearities in the 
signal recorded from depth and subdural electrodes in patients with temporal 
lobe epilepsy, we found that signals generated by the epileptogenic focus were 
strongly nonlinear [4-6]. We also found nonlinearities in signals generated by 
interictal spike foci located in the temporal cortex contralateral to the epilepto- 
genic focus. 

Traditionally, the initial occurrence of a characteristic focal rhythmic EEG 
discharge, most commonly in the hippocampus or mesial temporal cortex, is 
considered to be the onset of a seizure. However, we have discovered, through 
analysis of the spatiotemporal dynamics of invasive electrographic recordings 
in patients with medically intractable temporal lobe epilepsy, a preictal transi- 
tion process [13-22,24,31]. The onset of this transition precedes the seizure 
for periods of up to 1 hour. This transition was remarkably similar for each 
of the 48 seizures analyzed (4 seizures in each of 12 patients) and was not 
observed in the interictal data examined. The preictal dynamical transition is 
characterized by the progressive convergence of the mean of the STLmax 
values among specific anatomical areas (mean value entrainment) at specific times. 
For each case analyzed, this gradual preictal entrainment process culminated in 
a seizure. The discovery of a preictal transition period that can be detected by 
its quantitative dynamical characteristics offers the possibility of predicting an 
impending seizure in time to intervene therapeutically in order to abort the 
transition and prevent a seizure from occurring. If seizures can be detected minutes 
in advance, it could lead to the development of novel therapeutic approaches 
designed to disrupt the preictal transition process such as by electrical stimu- 
lation (e.g. vagal nerve stimulation) or by timely release of an anticonvulsant 
drug. 

In this chapter, we emphasize the use of STLmax as a measure of the dy- 
namical state of the electrographic recordings to investigate the interactions 
among brain areas during the interictal, preictal, ictal and postictal states. The 
Lyapunov exponent is a measure of the rate of production or destruction of 
information (usually expressed in bits per second) and is an indicator of how 
ordered (negative Lyapunov exponent) or chaotic (positive Lyapunov exponent) 
the series or the system is over a given time period. There are well established 
techniques for estimating the value of Lmax in a time series. The EEG signal 
can be considered as a time series generated by a multi-dimensional system, the 
brain. In a multi-dimensional signal, as many Lyapunov exponents can be de- 
fined as there are dimensions [9]. A chaotic signal, by definition, has at least one 
positive Lyapunov exponent. The largest Lyapunov exponent (Lmax) can be 
used to partially characterize the dynamical steady state of a physical system. In 
this application, the maximum Lyapunov exponent was estimated sequentially 
in each non-overlapping 10.24 sec EEG segment for each recorded cortical 
site. The interactions among brain areas were then investigated by generating 
the pair-T statistic time series of the maximum Lyapunov exponent values. We 
will utilize these observations to test the hypotheses: (1) seizures are a transition 
from spatiotemporal chaos to a more ordered ictal (symptomatic) state, (2) dur- 
ing the interictal (asymptomatic) state, the epileptogenic focus is dynamically 
isolated from other areas of the cerebral hemispheres, (3) the seizure discharge 
can occur only after a preictal transition state during which the focus starts to 
interact with other cortical areas (preictal state), and (4) the seizure serves to 
reset the brain, reversing the pathological interaction among critical cortical 
sites, thus dynamically isolating the seizure onset zone. 

The rest of the chapter is organized as follows: In section 2, the algorithms 
for estimating STLmax and pair-T statistic are described. In section 3, the 
analyzed EEG recordings and the results of the analysis are presented. Section 
4 gives the discussion of the results. 

2. Nonlinear Dynamical and Statistical Measures 
2.1. Nonlinear Dynamical Measure: 

Iasemidis and Sackellares [22] applied the method of delays developed by
Packard et al. and Takens [30,37] to reconstruct a multidimensional state space 
from a single-channel EEG signal. After the reconstruction by an embedding, 
each state is represented in the state space by a vector $Y_t$ whose components
are the delayed versions of the original single-channel EEG time series $u_t$, that
is:

$$Y_t = [u_t, u_{t-\tau}, \ldots, u_{t-(p-1)\tau}] \qquad (1)$$

where $Y_t$ is a vector in the state space at time t, $\tau$ is the time delay between
successive components of $Y_t$, and p is the embedding dimension of the
reconstructed state space. The embedding dimension p is the dimension of the state
space that contains the steady state of the system (i.e. attractor) and it is always 
a positive integer. On the other hand, the attractor's dimension D may be a
positive non-integer (fractal). D is directly related to the number of variables
of the system and is usually inversely related to the existing coupling among
them. According to Takens [37], the embedding dimension p should be at least
equal to (2D + 1) in order to correctly embed an attractor in the state space.
Each of the many methods used to estimate D of an object in the state
space has its own practical problems. The measure most often used to estimate
D is the state space correlation dimension $\nu$. Methods for calculating $\nu$ from
experimental data have been described and were employed in our work to
approximate D of the epileptic attractor. In the EEG data we have analyzed to
date, $\nu$ is found to be between 2 and 3 during an epileptic seizure. Therefore,
in order to capture characteristics of the epileptic attractor, we have used an 
embedding dimension p of 7 for the reconstruction of the state space. 
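
The reconstruction in equation (1) can be illustrated with a short sketch. It is a minimal example in plain Python, not the authors' implementation; the function name `delay_embed` and the toy signal are hypothetical. The lag of 4 samples assumes the 20 msec delay and the 200 Hz sampling rate reported later in this chapter.

```python
def delay_embed(u, p, lag):
    """Return delay vectors Y_t = [u_t, u_{t-lag}, ..., u_{t-(p-1)*lag}]."""
    if p < 1 or lag < 1:
        raise ValueError("p and lag must be positive integers")
    start = (p - 1) * lag  # first index that has a full history behind it
    return [[u[t - k * lag] for k in range(p)] for t in range(start, len(u))]


# Toy signal standing in for one EEG channel (hypothetical data).
signal = list(range(30))
# p = 7 as in the text; 20 msec at 200 Hz corresponds to a lag of 4 samples.
vectors = delay_embed(signal, p=7, lag=4)
```

Under these settings, a 10.24 sec epoch (2048 samples) would yield 2048 - 24 = 2024 state vectors.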

Since the brain is a nonstationary system, algorithms used to estimate
measures of the brain dynamics should be capable of automatically identifying and
appropriately weighing existing transients in the data. The method we
developed for estimation of Lmax for nonstationary data, called STL (Short-Term
Lyapunov), considers possible nonstationarities in the EEG. This method was
explained in detail in Iasemidis et al. [22]. We apply the STL algorithm to EEG
tracings from electrodes in multiple brain sites to create a set of STLmax time
series. This set of time series contains local (in time and in space) information 
about the brain as a dynamical system. It has been shown that it is at this level 
of spatiotemporal analysis that reliable detection of the transition to epileptic 
seizures, long before they actually occur, is derived [15,23,24,31]. 

2.2. Statistical Determination of Dynamical Entrainment 
among Brain Areas: 

Dynamical entrainment is defined as the convergence of STLmax values 
among the electrode sites within a window. This convergence is quantified by 
the average of pair-T statistics over all pairs among the group of sites. We 
defined this value as average T-index. The calculation of a pair-T statistic is 
described as follows: 

For electrode sites i and j, let their STLmax values in a window $W_t$ of n
STLmax points be

$$L_i^t = \{STL_{max,i}^{t}, STL_{max,i}^{t+1}, \ldots, STL_{max,i}^{t+n-1}\} \qquad (2)$$

$$L_j^t = \{STL_{max,j}^{t}, STL_{max,j}^{t+1}, \ldots, STL_{max,j}^{t+n-1}\} \qquad (3)$$

and let their pointwise differences be

$$D_{ij}^t = L_i^t - L_j^t = \{STL_{max,i}^{t} - STL_{max,j}^{t}, \ldots, STL_{max,i}^{t+n-1} - STL_{max,j}^{t+n-1}\} \qquad (4)$$

Then, the pair-T statistic at time window $W_t$ between electrode sites i and j
is calculated by

$$T_{ij}^t = \frac{|\bar{D}_{ij}^t|}{\hat{\sigma}_{ij}^t / \sqrt{n}} \qquad (5)$$

where $\bar{D}_{ij}^t$ and $\hat{\sigma}_{ij}^t$ are the sample mean value and the sample standard deviation
of $D_{ij}^t$.

If the true mean of $D_{ij}^t$, denoted by $\mu_{ij}^t$, is equal to zero, and the assumptions
of $D_{ij}^t$ being independent and normally distributed are valid [23], then asymptotically
$T_{ij}^t$ is distributed as a t-distribution with n - 1 degrees of freedom.
We define electrode sites i and j to be disentrained if $\mu_{ij}^t$ is
significantly different from zero at significance level $\alpha$. Disentrainment
between two electrode sites can then be detected by the pair-T test: electrode
sites i and j are disentrained if $T_{ij}^t > t_{\alpha/2,n-1}$, where $t_{\alpha/2,n-1}$ is the
100(1 - $\alpha$/2)% critical value of the t-distribution with n - 1 degrees of freedom. If
$T_{ij}^t < t_{\alpha/2,n-1}$, the differences of STLmax values between electrode
sites i and j in the time window $W_t$ do not provide sufficient evidence to claim
disentrainment between these two sites. In this situation, we consider sites
i and j to be entrained with each other in $W_t$.
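
The entrainment test can be sketched as follows. This is a minimal illustration in plain Python, assuming the pair-T statistic is the absolute mean of the STLmax differences divided by its standard error; the function names and the toy values are hypothetical. The critical value has to be supplied by the caller (3.182 is the standard two-sided 5% value of the t-distribution with 3 degrees of freedom, used here only because the toy window has 4 points).

```python
import math
import statistics


def pair_t_statistic(stl_i, stl_j):
    """Pair-T statistic of two equally long STLmax windows."""
    if len(stl_i) != len(stl_j) or len(stl_i) < 2:
        raise ValueError("windows must have equal length of at least 2")
    diffs = [a - b for a, b in zip(stl_i, stl_j)]
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_d == 0:
        return 0.0 if mean_d == 0 else math.inf
    return abs(mean_d) / (sd_d / math.sqrt(len(diffs)))


def is_entrained(stl_i, stl_j, t_critical):
    """Sites count as entrained when disentrainment cannot be claimed."""
    return pair_t_statistic(stl_i, stl_j) < t_critical


# Hypothetical STLmax values (bits/sec) for two sites in one window.
site_i = [5.0, 5.2, 4.9, 5.1]
site_j = [4.0, 4.1, 4.2, 3.9]
t_value = pair_t_statistic(site_i, site_j)
entrained = is_entrained(site_i, site_j, t_critical=3.182)
```

In this toy window the two sites differ by about 1 bit/sec throughout, so the statistic is large and the pair is classified as disentrained.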

T-index profiles were generated over time for all possible pairs. The average
T-index curve between two brain areas was created by averaging the T-index over
all possible pairs between these two areas. For example, the average T-index curve
between LTD (left temporal depth) and RTD (right temporal depth) over time
is the average over all possible pairs (6 LTD sites x 6 RTD sites = 36 pairs) between
these two areas. A small average T-index indicates that the two brain areas
interact more strongly (are more entrained) with each other, with respect to
their STLmax values, in that particular time window.

3. Results 

Electrographic recordings from bilaterally, surgically implanted
microelectrodes in the hippocampus and the temporal and frontal lobe cortices of a
patient with temporal lobe epilepsy, with complex focal and secondarily
generalized seizures, were analyzed (see Figure 12.1 for our typical electrode
montage). The EEG signals were recorded using amplifiers with an input range of
±0.6 mV and a frequency range of 0.5 to 70 Hz. Prior to storage, the signals
were sampled at 200 Hz using an analog-to-digital (A/D) converter with 10-bit
quantization. The multi-electrode EEG signals (28 common reference
channels) were obtained from long-term (11.67 hours) continuous recordings from
this patient. Five seizures of mesial temporal onset were recorded during the
period of recordings. This EEG recording was viewed by two independent elec- 
troencephalographers to determine the number and type of recorded seizures, 
seizure onset and end times, and seizure onset zones. The area RTD (right 
temporal depth) was identified as the epileptogenic focal area of this patient. 

Figure 12.2 demonstrates a typical STLmax profile from an epileptogenic 
focal cortical site over 3 hours including two seizures and 1 hour after the 
second seizure. The STLmax values remain high during the interictal period 
(more chaotic), gradually start decreasing approximately 30 minutes before the 
seizure (more ordered), and drop to the lowest points during the ictal periods 
(most ordered). These observations support the hypothesis that seizures are a 
transition from temporal chaos to a more ordered ictal (symptomatic) state.

Figure 12.1. Graphic illustration of placement of subdural electrode strips and depth electrodes
used for long-term diagnostic intracranial EEG recordings. Electrode strips are placed over 
the left orbitofrontal (LOF), right orbitofrontal (ROF), and left subtemporal (LST) and right 
subtemporal (RST) cortex. Depth electrodes are placed in the left temporal (LTD) and right 
temporal (RTD) lobes to record electrical activity generated by the amygdala and hippocampus. 



Figure 12.3 shows the percentage of entrained electrode pairs matched from
different brain areas at significance level $\alpha$ = 0.05. Four electrodes from
each area were included in the analysis, giving a total of 240 electrode
pairs. The figure shows that during the ictal periods the percentage of
entrained electrode pairs increases significantly to its maximum values, and drops
to its lowest points approximately 10 minutes after the seizure. The
observation of massive convergence of STLmax (entrainment) during the ictal period
indicates increased spatial order and thus supports our hypothesis 1. Further, the
observation that less entrainment occurs after the seizure indicates that the seizure
serves to reset the brain, reversing the pathological interaction among critical
cortical sites (hypothesis 4).
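
The count of 240 pairs is consistent with the six recording areas named in Figure 12.1 (LOF, ROF, LST, RST, LTD and RTD), under the assumption that every pair combines one of the four electrodes of one area with one of the four electrodes of a different area:

```latex
\binom{6}{2} \times 4 \times 4 = 15 \times 16 = 240
```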

Figures 12.4 and 12.5 show the T-index profiles between the epileptogenic 
focus (RTD/RST) and other areas (LTD, LST, LOF and ROF) of the cerebral 
hemispheres. During the interictal state, generally higher values of T-index 
(divergent values of STLmax) indicate that the epileptogenic focus is dynam- 
ically isolated from other areas of the cerebral hemispheres. This observation 
supports our hypothesis 2. Further, the average differences in STLmax
values between the epileptogenic hippocampus and most other sites are typically
reduced (low T-index values, “dynamical entrainment”) prior to each seizure,
which supports hypothesis 3. The fourth hypothesis is also supported by
a divergence of STLmax (increase in the T-index) between the epileptogenic
hippocampus and other cortical sites after each seizure. In other words, the
dynamical entrainment between the epileptogenic focus and other areas is reset
(disentrained) by the occurrence of the seizures.

Figure 12.2. STLmax profiles over 3 hours including two seizures and 1 hour after the second
seizure. Using embedding dimension p = 7 and time delay $\tau$ = 20 msec for the state space
reconstruction, the STLmax values were estimated by dividing the EEG signal into non-overlapping
epochs of 10.24 seconds each.

Figure 12.3. Percentage of entrained ($\alpha$ = 0.05) electrode pairs matched in different brain
areas.

Figure 12.4. T-index profiles between RTD and other areas of the cerebral hemispheres over
12 hours including 5 seizures: (a) RTD versus LTD, (b) RTD versus LST, (c) RTD versus LOF,
and (d) RTD versus ROF.

Figure 12.5. T-index profiles between RST and other areas of the cerebral hemispheres over
12 hours including 5 seizures: (a) RST versus LTD, (b) RST versus LST, (c) RST versus LOF,
and (d) RST versus ROF.

4. Discussion 

In this chapter, we proposed hypotheses about the state transitions of temporal lobe
epilepsy during the interictal, preictal, ictal and postictal states. These
hypotheses are supported by observations based on the dynamical and
statistical analysis of the EEG signals. These studies suggest that a seizure is a
spatiotemporal transition and occurs only when a sufficiently large area of the
cerebrum becomes dynamically entrained with the epileptogenic focus. Although
these observations could lead us toward an understanding of the transitions
in temporal lobe epilepsy, questions such as how the dynamical
entrainment occurs and how it develops prior to a seizure remain
unanswered. A full understanding of these mechanisms is key to developing
a reliable epileptic seizure predictor and to controlling or preventing an impending
seizure well before its actual onset.

Abstract Epilepsy is a dynamical disorder of the brain. Initial results obtained from a
small sample of patients by employing signal processing techniques based on
the theory of nonlinear dynamics and statistics led us to hypothesize that
dynamical entrainment among the critical cortical brain sites (i.e., the gradual
convergence in time of their dynamical measures) can be used to anticipate an
impending seizure [10, 17]. In this chapter, we present the results from a larger
sample of patients and seizures to confirm this hypothesis.

Through the analysis of long-term intracranial EEG recordings obtained in 10 
patients with 118 medically intractable seizures, we observed that seizures were 
preceded by a preictal transition that evolves over approximately 54 minutes. This 
transition is followed by a seizure. The study of this process has been hampered 
by its complexity and variability. A major problem is that the transitions involve 
a subset of brain sites that vary from seizure to seizure, even in the same patient, 
which is expected since the state of the patient may be different before each 
seizure. However, by combining dynamical analytic techniques with a critical 
cortical site selection method, we have been able to elucidate important dynamical 
characteristics underlying human epilepsy. We anticipate that these observations 
will lead to a better understanding of the physiological processes involved. We 
illustrate the use of these approaches in confirming our hypotheses regarding
entrainment characteristics prior to the seizure. Thus, it may be possible to develop
novel therapeutic approaches involving carefully timed interventions to prevent
seizures from occurring.

Keywords: Epilepsy, nonlinear dynamics, Lyapunov exponents, T-index, preictal transition,
seizure predictability



1. Introduction 

Epilepsy is the most common serious brain disorder in every country of the 
world. It may be the most universal of all medical disorders, affecting all ages,
races, social classes, and nations [18]. Worldwide, at least 40 million or 7 of
every 1,000 individuals currently suffer from epilepsy [2]. Estimates of
incidence rates range from 24 to 53 per 100,000 [1, 3, 8, 9, 19, 21]. The mainstay
of contemporary treatment for epilepsy is pharmacological. Anticonvulsant 
drugs are taken daily, in fixed doses, and are titrated to achieve a steady-state 
concentration in the blood. One problem with chronic daily dosing with anti- 
convulsant drugs is that many patients develop a tolerance to the anticonvulsant 
effect. This is particularly true for the most powerful class of anticonvulsants, 
the benzodiazepines. During the past decade several new anticonvulsants have
become available for adults and children with epilepsy. Nonetheless,
approximately 25% of these individuals have seizures that are refractory to medical
therapy [5]. For these patients, surgical treatment may be an option. Surgical 
treatment can be effective in carefully selected cases. Good responses occur in 
approximately 70 to 90% of adults with temporal lobe epilepsy [5]. However, 
therapeutic response rates drop off markedly in individuals with more than one 
epileptogenic focus or those with generalized seizures. 

Among the most disabling aspects of epilepsy are the anticipation of seizures 
and the uncertainty of when the next seizure will occur. Patients with epilepsy 
usually appear normal and function normally for much of the time. Then 
suddenly a seizure occurs. The central questions as to why seizures occur 
intermittently, and when they begin and end, remain unanswered.

Our previous studies have shown that seizures are not abrupt events; instead, 
they follow a dynamical transition that evolves over minutes to hours before 
the seizures [10-17,22]. During this preictal dynamical transition, multiple 
regions of the cerebral cortex progressively approach a similar dynamical state. 
Other investigations have confirmed the presence of a progressive preictal tran- 
sition [4,6,7,20]. For seizures of frontal and temporal lobe origin, the dynamical 
transition involves the gradual convergence of the dynamical characteristics of 
multiple areas of the cerebral cortex toward a similar state. This preictal con- 
vergence may involve different brain regions from seizure to seizure even in the 
same patient. Therefore, accurate detection of the preictal transition depends
upon selecting the appropriate cortical sites. In this chapter, we present a pre- 
ictal transition detection algorithm by combining the nonlinear dynamics and 
statistical analyses with a critical cortical sites selection method. 

The rest of the chapter is organized as follows: In section 2, the measures
and the methodology employed for testing the seizure predictability hypotheses
are described. In section 3, the EEG data and the results of the data analysis
are presented. Section 4 provides a discussion of the results.

2. Measures and Methods 
2.1. Dynamical Measure 

We utilized an estimate of the Short-Term Maximum Lyapunov exponent
(STLmax) as the dynamical measure of the electroencephalogram. STLmax
was estimated by dividing the EEG signal into non-overlapping
segments of 10.24 sec each. The largest Lyapunov exponent ($L_{max}$ or $L_1$) is
defined as the average of the local Lyapunov exponents $L_{ij}$ in the state space, that
is:

$$L_{max} = \frac{1}{N_a} \sum_{i,j} L_{ij} \qquad (1)$$

where $N_a$ is the total number of the local Lyapunov exponents that are
estimated from the evolution of adjacent points (vectors) in the state space,
$X_i = X(t_i)$, $X_j = X(t_j)$, and

$$L_{ij} = \frac{1}{\Delta t} \log_2 \frac{|X(t_i + \Delta t) - X(t_j + \Delta t)|}{|X(t_i) - X(t_j)|} \qquad (2)$$

where $\Delta t$ is the evolution time allowed for the vector difference
$|X(t_i) - X(t_j)|$ to evolve to the new difference $|X(t_i + \Delta t) - X(t_j + \Delta t)|$.
If $\Delta t$ is given in sec, then STLmax is in bits/sec. More details regarding STLmax
can be found in Iasemidis et al., 1990 [16]. Figure 13.1 shows a 20-minute EEG
recording, 10 minutes before and after a seizure onset, and its corresponding
STLmax curve.
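
The averaging of local exponents can be sketched in a few lines. This is only an illustration of the two definitions above in plain Python: the function names and the toy state vectors are hypothetical, and the sketch assumes that the pairs of neighbouring states have already been chosen, which is the part of the STL algorithm that handles nonstationarity and is not reproduced here.

```python
import math


def local_exponent(x_i, x_j, x_i_next, x_j_next, dt):
    """Local Lyapunov exponent (bits/sec) for one pair of neighbouring states."""
    d0 = math.dist(x_i, x_j)            # initial separation
    d1 = math.dist(x_i_next, x_j_next)  # separation after the evolution time dt
    if d0 == 0 or d1 == 0:
        raise ValueError("states must be distinct before and after evolution")
    return math.log2(d1 / d0) / dt


def largest_exponent(pairs, dt):
    """Average of the local exponents over all supplied pairs."""
    values = [local_exponent(*pair, dt) for pair in pairs]
    return sum(values) / len(values)


# Hypothetical 2-D states: (X(t_i), X(t_j), X(t_i + dt), X(t_j + dt)).
pairs = [
    ((0.0, 0.0), (1.0, 0.0), (0.0, 0.0), (2.0, 0.0)),  # separation doubles
    ((0.0, 0.0), (0.0, 1.0), (0.0, 0.0), (0.0, 1.0)),  # separation unchanged
]
l_max = largest_exponent(pairs, dt=0.5)
```

A doubling of the separation in 0.5 sec contributes 2 bits/sec and an unchanged separation contributes 0, so the average for this toy input is 1 bit/sec.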



2.2. Statistical T-index 

We used the T-index (from the statistical paired-T test) to measure the degree
of entrainment (convergence with respect to the STLmax values) between
cortical sites. The T-index of a pair of sites was calculated in each 10-min sliding
window (60 STLmax segments) by dividing the mean difference of STLmax
values between the two cortical sites by its standard deviation. That is, the
T-index at time t between cortical sites i and j is defined as:

$$T_{i,j}(t) = \sqrt{N} \times |E\{STL_{max,i} - STL_{max,j}\}| / \sigma_{i,j}(t) \qquad (3)$$

where $E\{STL_{max,i} - STL_{max,j}\}$ denotes the average difference of
STLmax between electrode sites i and j within a 10-min time window,
$\sigma_{i,j}(t)$ is the sample standard deviation of the differences, and N is the
number of STLmax values in the window (N = 60).
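
A T-index curve over time can be sketched by sliding the window along the two STLmax series. The example below is a hedged illustration in plain Python: the function name `t_index_curve` is hypothetical, the window advances by one STLmax point (10.24 sec) at a time, which is an assumption, and the statistic is taken as the square root of the window length times the absolute mean difference divided by the sample standard deviation.

```python
import math
import statistics


def t_index_curve(stl_i, stl_j, window=60):
    """T-index for every full window of `window` consecutive STLmax points."""
    if len(stl_i) != len(stl_j):
        raise ValueError("series must have equal length")
    if window < 2:
        raise ValueError("window must contain at least 2 points")
    diffs = [a - b for a, b in zip(stl_i, stl_j)]
    curve = []
    for start in range(len(diffs) - window + 1):
        chunk = diffs[start:start + window]
        mean_d = statistics.fmean(chunk)
        sd_d = statistics.stdev(chunk)  # sample standard deviation
        if sd_d == 0:
            curve.append(0.0 if mean_d == 0 else math.inf)
        else:
            curve.append(math.sqrt(window) * abs(mean_d) / sd_d)
    return curve


# Toy series and a short window so the numbers are easy to follow by hand.
curve = t_index_curve([1.0, 2.0, 1.0, 2.0, 1.0], [0.0] * 5, window=4)
```

With the default `window=60`, each value covers roughly 10 minutes of EEG, matching the window length used in the text.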

2.3. Critical Cortical Sites Selections 

One of the most important tasks in detecting the dynamical transitions is to
identify the group of cortical sites most likely to participate in the
preictal transition of an impending seizure. Here we identify the most critical
group of cortical sites based on the dynamical entrainment (T-index values) in
the 10-min time window before the seizure. More specifically, the algorithm
selects the group of cortical sites that are most entrained (minimum average
T-index) prior to the seizure. This task can be accomplished by creating
a T-index matrix before the seizure. The objective in this chapter is to confirm
the hypothesis that a preictal transition can be detected by observing the average
T-index curve of the identified most critical group of cortical sites. Figure 13.2
shows some examples of T-index curves from the identified most critical cortical
sites before seizures. The T-index curves gradually decrease
and drop below a critical value tens of minutes before seizures. These critical
cortical sites remain entrained until the seizure occurs.

Figure 13.1. EEG and STLmax profiles over 20 minutes including a 1.5-minute ictal period.
The estimation of the STLmax values was made by dividing the EEG signal into non-overlapping
segments of 10.24 seconds each, using embedding dimension p = 7 and time delay $\tau$ = 20
msec for the state space reconstruction.

Figure 13.2. Four T-index curves from the identified groups of critical cortical sites before
seizures.
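
Selecting the most entrained group from a T-index matrix can be sketched as an exhaustive search. This is a minimal illustration, not the selection procedure used by the authors: the group size is assumed to be fixed in advance, every subset of that size is scored by its average pairwise T-index, and the function name and the toy matrix are hypothetical. Exhaustive search is only practical for small numbers of sites.

```python
from itertools import combinations


def most_entrained_group(t_matrix, group_size):
    """Return the site group with the minimum average pairwise T-index."""
    if group_size < 2 or group_size > len(t_matrix):
        raise ValueError("group_size must be between 2 and the number of sites")

    def average_t(group):
        pairs = list(combinations(group, 2))
        return sum(t_matrix[a][b] for a, b in pairs) / len(pairs)

    best = min(combinations(range(len(t_matrix)), group_size), key=average_t)
    return best, average_t(best)


# Hypothetical symmetric T-index matrix for 4 sites in the 10-min
# window before a seizure (diagonal entries are unused).
t_matrix = [
    [0.0, 1.0, 5.0, 6.0],
    [1.0, 0.0, 2.0, 7.0],
    [5.0, 2.0, 0.0, 3.0],
    [6.0, 7.0, 3.0, 0.0],
]
group, score = most_entrained_group(t_matrix, group_size=3)
```

For this toy matrix the group of sites 0, 1 and 2 has the smallest average T-index and would be tracked as the critical group.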



2.4. Detection of Preictal Transitions 

Based on the T-index curve described above, the decision of whether a preictal
transition is detectable is made by the following steps:

(1) Sequentially average the T-index values, backwards in time from the
    seizure onset, to identify the first time point where the averaged T-index
    curve is above the critical value $T_\alpha$ (from the T-distribution for a given
    significance level $\alpha$). Then, define the duration of the preictal transition
    ($DPT_\alpha$) as the time interval from this time point to
    the seizure onset time point.

(2) After determining $DPT_\alpha$, set the moving window length equal to
    $DPT_\alpha$. The average T-index value within each overlapping moving
    window is then estimated backwards in time up to 2 hours prior to the
    seizure's onset. A false positive is observed if the average T-index within
    one of the windows is less than the critical value $T_\alpha$ that was used to
    determine $DPT_\alpha$.

(3) Two significance levels, $\alpha_1$ and $\alpha_2$, are used in this method. The hypothesis
    that the preictal transition can be detected is rejected if false positives
    are observed for both critical values, either at the same or at different time
    points in the available interval prior to the seizure. Otherwise, the preictal
    transition is considered detected. Figure 13.3 illustrates
    a detectable and an undetectable preictal transition.

Figure 13.3. Illustrations of a detectable and an undetectable preictal transition.
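
The three steps above can be sketched as follows. The text leaves several details open, so this plain-Python example rests on labeled assumptions rather than on the authors' implementation: the T-index curve ends at the seizure onset, windows advance one point at a time, only windows that end before the start of the preictal transition are checked for false positives, a transition of zero length counts as a failure, and the 2-hour limit is passed as a number of points (`horizon`, about 703 points of 10.24 sec). All function names are hypothetical.

```python
def preictal_duration(t_curve, t_crit):
    """Step 1: number of points from the first crossing back to seizure onset."""
    total = 0.0
    for k, value in enumerate(reversed(t_curve), start=1):
        total += value
        if total / k > t_crit:  # running average, backwards from onset
            return k - 1
    return len(t_curve)


def window_means(values, length):
    """Means of all overlapping windows of the given length (step of one point)."""
    return [sum(values[s:s + length]) / length
            for s in range(len(values) - length + 1)]


def false_positive(t_curve, t_crit, horizon):
    """Step 2: is any earlier window as entrained as the preictal one?"""
    dpt = preictal_duration(t_curve, t_crit)
    if dpt == 0:
        return True  # assumption: no entrained period before onset is a failure
    recent = t_curve[-horizon:]
    baseline = recent[:len(recent) - dpt]  # assumption: exclude the transition itself
    return any(mean < t_crit for mean in window_means(baseline, dpt))


def transition_detected(t_curve, crit_values, horizon):
    """Step 3: rejected only if every critical value yields a false positive."""
    return not all(false_positive(t_curve, c, horizon) for c in crit_values)


# Hypothetical T-index curves; the last element is at seizure onset.
clean = [5, 5, 5, 5, 5, 5, 1, 1, 1]
noisy = [1, 1, 1, 1, 5, 5, 5, 5, 1, 1, 1]
clean_ok = transition_detected(clean, [2.0, 2.5], horizon=9)
noisy_ok = transition_detected(noisy, [2.0, 2.5], horizon=11)
```

The `noisy` curve contains an earlier entrained period of the same length as the preictal one, so both critical values produce a false positive and the transition is treated as undetectable.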



3. Results 

In this section, results from the application of the previously described
method to detect the preictal transitions of epileptic seizures are shown. The
method was applied to 118 epileptic seizures in 10 patients. Table 13.1
summarizes the statistics of the analyzed EEG data sets.

Table 13.2 summarizes the results of this analysis for all 118
seizures. The results show that more than 85% of the seizures have detectable
preictal transitions. The average time interval of preictal transitions is
approximately 54 minutes.

Table 13.1. Statistics of ten analyzed EEG data sets from ten patients. RTL = Right Temporal
Lobe, LTL = Left Temporal Lobe.

Patient   Epileptic Focus   Total # of Seizures   Mean Seizure Interval (Hrs)
#1        RTL               24                    3.62
#2        RTL               19                    7.77
#3        LTL               8                     3.23
#4        LTL               4                     2.17
#5        RTL               3                     4.15
#6        RTL               7                     20.32
#7        RTL               9                     8.69
#8        RTL               17                    17.10
#9        RTL               18                    15.30
#10       RTL               9                     7.59
Total                       118                   9.69

Table 13.2. Summary of the preictal transition detection analysis on 118 epileptic seizures in
10 patients.

Patient   Total # of   Predictable   Detectability of Preictal   Average Prediction
          Seizures     Seizures      Transitions (%)             Time (min.)
#1        24           21            87.5                        66.9
#2        19           17            89.5                        29.8
#3        8            8             100.0                       49.5
#4        4            4             100.0                       44.1
#5        3            3             100.0                       34.4
#6        7            4             57.1                        43.3
#7        9            8             88.9                        53.7
#8        17           15            88.2                        69.6
#9        18           13            72.7                        60.8
#10       9            8             88.9                        49.4
Total     118          101           85.6                        53.9

4. Discussions 

This study was undertaken to confirm the hypothesis that it is possible to
predict temporal lobe epileptic seizures by the analysis of dynamical and statistical
characteristics of multi-channel EEG signals recorded from multiple cortical
sites. Our previous studies from a small sample of patients have indicated the
existence of preictal transitions starting 30 to 60 minutes before a seizure
onset, in which the values of Lyapunov exponents of EEG recorded from critical
cortical sites become convergent. However, the cortical sites involved in this
dynamical transition vary from seizure to seizure. Thus, the ability to identify 
the critical cortical sites that participate in the preictal transition plays a key 
role for the prediction of an impending seizure. 

By employing a critical cortical site selection method, which selects the most
entrained group of cortical sites prior to a seizure, we demonstrated that 85.6%
of the 118 seizures analyzed can be anticipated by the detection of their preictal
transitions. These transitions can be observed on average approximately
54 minutes before seizure onset. These results, from a larger sample of
patients and seizures, not only confirm our previous findings of the existence of
the preictal transitions, but also give us more confidence to develop
computer-based, automatic, on-line, real-time seizure prediction systems. Such systems not
only can be used to enhance patient safety and treatment by alerting the nursing
and technical staff of impending seizures, but also could provide promise for
new diagnostic applications and novel approaches to seizure control.

Abstract To be able to predict a future event is the most convincing side of science.
Usually, in the beginning of an investigation, the prediction is not perfect, i.e., an
event may be missed or a prediction may turn out to be a false alarm. When past
prediction records are available, can we determine whether the prediction scheme
is promising? In this paper, we use the naive optimal prediction as a yardstick,
i.e., we test whether the new prediction scheme is better than the naive optimal
scheme through statistical hypothesis testing. Here the naive scheme is defined
as one in which we know only the distribution of the inter-arrival times, without any
other auxiliary information. We use the trade-off curve between false alarm rate and
sensitivity to measure the prediction performance and the bootstrap method to
compute the p-value. Real data and simulation examples are presented.

1. Introduction and Modeling

To be able to predict the occurrence of a disease episode has many
important clinical applications. The onset of stroke, heart attack, or severe pain are
examples. The particular example that motivates this paper is the prediction of
seizures in epilepsy patients. Epilepsy is among the most common disorders
of the nervous system. The incidence rates range from 24 to 53 per 100,000
per year [3, 6]. For epilepsy patients, the illness interferes with normal life, and
is sometimes fatal, only when a seizure occurs. Thus, any method that can
predict its occurrence would be a great clinical achievement. Unfortunately,
there is still no method at this moment that can predict seizures with high
reliability [7, 9]. Thus, we are seeking valid methods that may eventually lead to a
clinically useful prediction. How do we quantify a valid method? In this paper,
we propose a method that will show whether a given method is better than a
naive method based on some trivial information. In our case, the Lyapunov
exponent aspect of EEG is used to predict seizures, but the prediction is far
from perfect. Is the Lyapunov exponent of EEG really useful in prediction?

To measure prediction accuracy is not an easy matter. An obvious measure is
the prediction error defined by the distance between the predicted time and the
real occurrence time, such as the mean square error used in time series prediction.
The top scenario of Fig. 14.1 shows this possibility. We can use either $|y_1 - t_1|$
or $|y_1 - t_1|^2$ to measure the prediction error. However, this definition does not
work for the other two scenarios in Fig. 14.1. In the middle scenario,
three events occurred before the predicted time $y_1$. There seems to be no way to
define the prediction error by time difference. Also, in the bottom scenario,
when the first prediction $y_1$ failed to predict an occurrence, the predictor cannot
wait indefinitely. He will make a new prediction when there are signs of a new
event. So he predicts again at $y_2$ and $y_3$. It is again difficult to measure the
prediction error by time difference.




Figure 14.1. Possible scenarios between real events $t_0$, $t_1$, etc. and predicted times $y_1$, $y_2$, etc.
The length a is the length of the alert period.

Thus a commonly used criterion for event prediction is the trade-off between
sensitivity and false alarm rate [12]. Roughly speaking, sensitivity is defined as
the probability of making a correct prediction of an event and the false alarm rate
is the number of false alarms per unit time. The unit time can be one minute,
one hour, 24 hours, or any time interval. This causes some inconvenience
because the time unit can be subjective. We define the false alarm rate as

$$\phi = \text{average false alarms per event} = \lim_{n \to \infty} \frac{\text{number of false alarms in predicting } n \text{ events}}{n} \qquad (1)$$

Note that this definition is equivalent to false alarms per unit time when we know
the mean occurrence time.

To define sensitivity, we cannot avoid one new parameter, the alert period 
(denoted by a in this paper), i.e., if the event occurs within the alert period 
after the prediction (warning) starts, it is a correct prediction. Otherwise, the 
prediction is a false alarm. With this new parameter, we define the sensitivity 
as 

$$\psi = \text{Probability an event is correctly predicted.} \qquad (2)$$

Ideally, when $\phi$ increases $\psi$ will increase as well, but it may not be so if the prediction
scheme is “strange”. The computation of $\phi$ and $\psi$ for a general renewal process
is given in the Appendix.
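
For a finite record, the two quantities can be estimated by simple counting. The sketch below is a hedged illustration in plain Python: the function name is hypothetical, a prediction at time y is counted as correct when an event falls in the interval (y, y + a], and the treatment of the interval end points is an assumption that the text does not settle.

```python
def evaluate_predictions(events, predictions, alert):
    """Return (sensitivity, false alarms per event) for one record."""
    if not events:
        raise ValueError("at least one event is required")

    def covers(y, t):
        # The event at time t falls inside the alert period started at y.
        return y < t <= y + alert

    predicted = sum(any(covers(y, t) for y in predictions) for t in events)
    false_alarms = sum(not any(covers(y, t) for t in events) for y in predictions)
    n = len(events)
    return predicted / n, false_alarms / n


# Hypothetical event and prediction times in hours, with a 3-hour alert period.
sensitivity, false_alarm_rate = evaluate_predictions(
    events=[10.0, 20.0], predictions=[8.0, 14.0], alert=3.0
)
```

Here the first prediction catches the first event, the second prediction is a false alarm, and the second event is missed, so both estimates equal 0.5.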

For a fixed false alarm rate, we can compare two prediction schemes by their
sensitivity, or conversely, compare their false alarm rates at a fixed sensitivity.
Unfortunately, it is almost impossible to fix the false alarm rate in a sample with
a small number of events. For example, suppose the three scenarios in Fig. 14.1
are three samples from different patients. The estimated $\psi$'s and $\phi$'s are shown
in Table 14.1. If we fix the false alarm rate at 1.0, there is no estimate for
$\psi$. Thus we choose to use the trade-off curve in Fig. 14.2, which is similar
to the OC (operating characteristic) curve. We still use the name OC curve for
the $\phi$-$\psi$ relation. Usually a prediction scheme has many parameters that can be
tuned according to, for example, economic considerations. Thus the predictions in
Fig. 14.1 are from one option that allows a moderate false alarm rate. The predictor can reduce
the false alarm rate by making fewer predictions. Hence each sample in Table
14.1 merely represents a point in the OC diagram. By tuning the prediction
parameters, we can estimate the entire OC curve. A typical estimated OC curve
is shown in Fig. 14.3. Actually, this is one curve from real data to be discussed
in the example section. Intuitively, the OC curve should be a monotonic curve.
The horizontal part such as section b and the vertical part such as section c in
Fig. 14.3 can happen in theoretical curves, but the non-monotonic part d seems
to be an undesirable small sample property.

Figure 14.2. Two OC curves by two prediction schemes. The horizontal axis is the false alarm
rate/event, and the vertical axis is the sensitivity. $S_1$ and $S_2$ are two examples.

Figure 14.3. An estimated OC curve from a sample.

It is obvious that the prediction scheme $S_1$ is better than scheme $S_2$ in
Fig. 14.2. The superiority is uniform for any fixed $\phi$ or $\psi$. But if there is
crossover between the two curves, then it is difficult to define superiority.
Actually, there is no reasonable way to define superiority because it depends on
which false alarm rate you prefer. But in practice, it would be very difficult
to prove the existence or nonexistence of crossover based on estimated curves.
We propose to use the area above the curve (the upper corner area) A as a measure
of the performance of a prediction scheme, i.e.,

$$A = \int_0^{\infty} \left[ 1 - \psi(\phi) \right] d\phi. \tag{3}$$

Note that this area always exists in theory as well as in a sample, because if we
make a prediction immediately after an event and after each prediction failure,
then $\psi = 1$ and $\phi$ is bounded by the mean arrival time and the alert period
$a$. Thus, the curve is closed at the right end. Also, if no prediction is made, it
is closed at the left end. Hence the area above the curve is well defined. Obviously,
the smaller the $A$, the better the prediction scheme. Other areas of statistics also
use the whole curve to compare different schemes or treatments; the measure is
usually called the area under the curve [1, 2, 11]. However, our OC curve differs
from the traditional one, which has specificity on the x-axis, because $\phi$ can be
larger than 1. That is why we use the area above the curve: its best value is the
well-defined number 0. If we used the area under the curve, the best value would
depend on the range of $\phi$, which can vary.

Table 14.1. Estimated sensitivity $\hat{\psi}$ and false alarm rate $\hat{\phi}$ for Fig. 14.1.

Estimate        Top figure    Middle figure    Bottom figure
$\hat{\psi}$    0.5           0.0              1.0
$\hat{\phi}$    0.5           2/3              2.0
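The area above the curve can be approximated numerically from a handful of estimated OC points. The sketch below is one possible way to do this, assuming the points are joined by straight lines (trapezoidal rule) and that the list already contains the closing points at both ends; the function name and the sample points are hypothetical and are not taken from the data in this chapter.

```python
def area_above_curve(points):
    """Trapezoidal estimate of the area above an OC curve.

    points: iterable of (phi, psi) pairs, where phi is the false alarm
    rate per event and psi is the sensitivity. Linear interpolation
    between points is an assumption of this sketch.
    """
    pts = sorted(points)  # order by false alarm rate
    area = 0.0
    for (phi0, psi0), (phi1, psi1) in zip(pts, pts[1:]):
        # average height of the gap between the curve and psi = 1
        area += (phi1 - phi0) * ((1.0 - psi0) + (1.0 - psi1)) / 2.0
    return area


if __name__ == "__main__":
    # hypothetical OC points, closed at (0, 0) and at psi = 1
    demo = [(0.0, 0.0), (0.5, 0.6), (1.0, 0.9), (2.0, 1.0)]
    print(area_above_curve(demo))
```

A perfect predictor has sensitivity 1 at every false alarm rate, so this estimate returns 0 for it, in line with the remark that 0 is the best value.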



2. Naive Prediction Schemes 

If there are existing prediction schemes, we can compare a new scheme with
the best existing one. But when the prediction scheme is the first of its kind, we
have to justify its merit against some naive or trivial prediction scheme.

Two naive schemes were suggested in [8]: periodic prediction and random
prediction. The former predicts periodically, every so often, until the next event
appears; the latter uses a random guess for the next event. There is some
uncertainty in the second scheme because a random guess can be defined in many
different ways. We feel that a natural random prediction is to let the time to the
next prediction be an exponential random variable with mean $\mu$. When we adjust
$\mu$, we get an OC curve. Similarly, we can obtain the OC curve for the periodic
prediction by varying the period $\mu$.
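The two naive schemes can be sketched in a short simulation. The counting rules below are assumptions made for illustration, since definitions (1) and (2) are not restated here: each prediction at time t opens an alert window [t, t + a], an event is detected if it falls inside an open window, a prediction whose window expires before the event counts as a false alarm, and the clock restarts after each event. The function name `oc_point` and the demonstration intervals are hypothetical.

```python
import random


def oc_point(intervals, a, next_gap, rng):
    """Return (sensitivity, false alarms per event) for a naive scheme.

    intervals: inter-event times; a: alert period;
    next_gap(rng): time between successive predictions
    (constant for the periodic scheme, exponential for the random one).
    """
    detected = 0
    false_alarms = 0
    for x in intervals:              # x = time from previous event to this one
        t = next_gap(rng)            # first prediction after the previous event
        while t <= x:
            if x <= t + a:           # event falls inside the alert window
                detected += 1
                break
            false_alarms += 1        # window expired before the event
            t += next_gap(rng)
    n = len(intervals)
    return detected / n, false_alarms / n


if __name__ == "__main__":
    rng = random.Random(1)
    # hypothetical intervals: sums of 20 unit-mean exponentials
    intervals = [sum(rng.expovariate(1.0) for _ in range(20)) for _ in range(200)]
    for mu in (2.0, 5.0, 10.0, 20.0):
        periodic = oc_point(intervals, 4.0, lambda r, m=mu: m, rng)
        randomized = oc_point(intervals, 4.0,
                              lambda r, m=mu: r.expovariate(1.0 / m), rng)
        print(mu, periodic, randomized)
```

Each value of $\mu$ gives one point of the OC diagram for each scheme; sweeping $\mu$ traces out the two naive curves.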

The two measures, sensitivity and false alarm rate, can be combined if there
are costs associated with them, i.e., there is a cost $c_1$ for missing an event and
a cost $c_2$ for one false alarm [12]. In this case, the total cost of a prediction
scheme is

$$C = c_1 (1 - \psi) + c_2 \phi. \tag{4}$$

In general, the optimal scheme depends on the costs and the occurrence pattern.
However, under a mild condition, namely that the event arrivals have a
non-decreasing hazard rate, the optimal prediction is to predict once and keep
constant alert thereafter, where the hazard rate is defined as

$$\lambda(t) = \lim_{\Delta t \to 0} \frac{\Pr\{\text{the event will happen in } (t, t + \Delta t) \text{ given it has not occurred yet}\}}{\Delta t}. \tag{5}$$

This scheme is optimal regardless of the cost structure in (4). A proof is given
in the Appendix. Since this is a prediction method that requires no prior
information about the events, we define it as the optimal naive prediction scheme.
Assuming a non-decreasing hazard rate is quite reasonable for medical events,
because an acute disease episode usually occurs when the accumulation of certain
undesirable waste or fatigue in a tissue exceeds a threshold. Thus the hazard
increases with time.
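As a small numerical illustration of the non-decreasing hazard assumption, the sketch below evaluates the hazard rate of an Erlang waiting time, i.e., the time needed for k unit-rate exponential arrivals to accumulate. The Erlang model is chosen here only as an example of a threshold mechanism; it uses the standard identity that the hazard equals the density divided by the survival function, and the function name is hypothetical.

```python
import math


def erlang_hazard(t, k):
    """Hazard rate f(t) / (1 - F(t)) of an Erlang(k, rate 1) waiting time."""
    # density is exp(-t) * t**(k-1) / (k-1)!; survival is exp(-t) * sum_j t**j / j!
    # the common factor exp(-t) cancels in the ratio
    numerator = t ** (k - 1) / math.factorial(k - 1)
    denominator = sum(t ** j / math.factorial(j) for j in range(k))
    return numerator / denominator


if __name__ == "__main__":
    for t in (1.0, 5.0, 10.0, 20.0, 40.0):
        # k = 1 is the exponential case (constant hazard); k = 20 increases with t
        print(t, erlang_hazard(t, 1), round(erlang_hazard(t, 20), 4))
```

With k = 1 the hazard is constant, the boundary case of the condition, while for larger k it rises toward 1 as t grows.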
3. Testing the Hypothesis that the New Method is Better

The method described here can be used to compare any two prediction
schemes, but we use a new method and the optimal naive prediction method to
illustrate the methodology. Suppose there is a sample from a patient. By tuning the
parameters in the prediction scheme, we can estimate the OC curves and the areas
above the curve for the new prediction scheme ($\hat{A}_n$) and for the optimal
naive one ($\hat{A}_0$). If their true areas are $A_n$ and $A_0$, then we wish to test
the hypothesis

$$H_0 : A_n = A_0 \quad \text{versus} \quad H_1 : A_n \neq A_0. \tag{6}$$

Although we are merely interested in the one-sided alternative $A_n < A_0$, we
follow the usual caution and do a two-sided test in (6). At this moment, we do not
know how to find an exact test for (6). A t-type test was suggested in [11],
assuming that the two measures on the x- and y-axes have a bivariate normal
distribution, an assumption that is irrelevant to our work. We choose the bootstrap
method, i.e., take a random sample from the inter-arrival intervals with replacement
and compute $\hat{A}_0$ from this sample. The p-value can be found from the position
of $\hat{A}_n$ in the lower quantile of all the $\hat{A}_0$ values. More precisely,

$$\text{p-value} = \frac{\text{number of } \hat{A}_0 < \hat{A}_n \text{ in bootstrap sampling}}{\text{bootstrap size}}. \tag{7}$$

A small p-value indicates that it is unlikely that the optimal naive method can
reach such a small area ($\hat{A}_n$) attained by the new method.
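The bootstrap computation in (7) can be sketched as follows. The routine that turns a resampled set of inter-arrival intervals into the naive area is passed in as a callable, because its details depend on how the OC curve of the optimal naive scheme is estimated; the name `naive_area`, the bootstrap size, and the toy statistic in the demonstration are assumptions of this sketch and are not the procedure used for the results in this chapter.

```python
import random


def bootstrap_p_value(intervals, area_new, naive_area, n_boot=1000, seed=0):
    """Fraction of bootstrap naive areas that fall below the new method's area.

    intervals: observed inter-arrival intervals
    area_new: estimated area above the curve for the new method
    naive_area: callable mapping a list of intervals to the naive area
                (hypothetical helper supplied by the user)
    """
    rng = random.Random(seed)
    n = len(intervals)
    below = 0
    for _ in range(n_boot):
        # resample the inter-arrival intervals with replacement
        resample = [rng.choice(intervals) for _ in range(n)]
        if naive_area(resample) < area_new:
            below += 1
    return below / n_boot


if __name__ == "__main__":
    data = [18.2, 22.5, 15.1, 27.3, 19.8, 21.0, 24.4, 16.9]  # made-up intervals

    def toy_area(sample):
        # placeholder statistic so the demo runs; NOT a real OC-curve area
        mean = sum(sample) / len(sample)
        return sum(abs(x - mean) for x in sample) / len(sample)

    print(bootstrap_p_value(data, area_new=2.0, naive_area=toy_area))
```

Fixing the seed makes the resampling reproducible, which is convenient when the same intervals are reused to compare several candidate schemes.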

In many practical cases, one sample may not contain enough events to produce
a small enough p-value. Suppose there are $n$ samples with p-values
$p_1, p_2, \ldots, p_n$. We suggest using the concept of meta-analysis to combine
them. Let $z_i = \Phi^{-1}(p_i)$, where $\Phi(x)$ is the standard normal distribution
function. The combined p-value to test whether the new scheme is better than the
naive one becomes

$$\text{p-value} = 2\Phi(z), \qquad z = \frac{\sum_{i=1}^{n} z_i}{\sqrt{n}}. \tag{8}$$

The reason for this computation is simple: under the null hypothesis, each
p-value should be uniformly distributed in (0, 1). The inverse normal transform
makes each $z_i$ a standard normal random variable, and so is $z$. The p-value
indicates how extreme the $p_1, p_2, \ldots, p_n$ are when $H_0$ is true.
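The combination in (8) can be checked against the individual p-values listed in Table 14.2. The sketch below uses the two-sided form $2\Phi(-|z|)$, which coincides with $2\Phi(z)$ whenever $z \le 0$; this choice is an assumption of the sketch, made because it is the form that reproduces the three "Overall" entries of the table when recomputed. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def combine_p_values(p_values):
    """Combine p-values via the inverse normal transform.

    Returns (z, p) with z = sum(z_i) / sqrt(n) and p = 2 * Phi(-|z|).
    """
    std_normal = NormalDist()                      # mean 0, standard deviation 1
    z_scores = [std_normal.inv_cdf(p) for p in p_values]
    z = sum(z_scores) / sqrt(len(z_scores))
    return z, 2.0 * std_normal.cdf(-abs(z))


# individual p-values as listed in Table 14.2 (seven subjects per row)
TABLE_ROWS = {
    "STLmax": [0.344, 0.104, 0.190, 0.090, 0.336, 0.102, 0.196],
    "Periodic": [0.344, 0.606, 0.496, 0.332, 0.760, 0.328, 0.378],
    "Random": [0.344, 0.794, 0.778, 0.332, 0.452, 0.574, 0.378],
}

if __name__ == "__main__":
    for name, row in TABLE_ROWS.items():
        z, p = combine_p_values(row)
        print(name, round(z, 3), round(p, 3))
```

A negative z means the individual p-values sit mostly in the lower tail, i.e., the scheme tends to have the smaller area; a positive z points the other way.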

4. An Example 

A new method based on the entrainment of the short-term maximum Lyapunov
exponent ($STL_{max}$) has been constructed to predict seizures [4, 5]. It has had
some successes and some failures. The question is whether this method is headed
in the right direction toward seizure prediction: is it better than the optimal
naive method?
Seven epilepsy patients with various numbers of seizures were examined. We
examined the inter-arrival times from each patient and found that the
inter-arrivals could be considered independent (by sample autocorrelations) with
non-decreasing hazard rates (by histogram). Thus, it is fair to compare the
$STL_{max}$ method with the optimal naive method. Table 14.2 shows the results
from each individual and their joint performance. Though a low significance
level is not attained for any individual subject, the overall p-value of 0.015
computed by (8) shows that the performance of $STL_{max}$ described in [4, 5] is not
an artifact. We also tested this method by comparing the optimal naive method with
the periodic and random prediction schemes. They are not significantly different,
although both methods seem to be on the worse side. If they had reached an overall
p-value larger than 0.95, we would conclude that they are indeed worse than
the optimal naive method at the usual 0.05 significance level.

5. Power Analysis 

To see how powerful this testing scheme is, we use a compound Poisson
process [10] to generate events. Suppose chemicals accumulate in a tissue and a
breakdown (event) occurs when the accumulated chemical reaches a threshold
(see Fig. 14.4). Suppose the arrival of each chemical follows a Poisson process.
Without loss of generality, we let each chemical added to the tissue be 1 (unit of
chemical) and the mean arrival time be 1 (unit of time). Let the time $t$ be 0 after
each event. Thus, the accumulated chemical $x_t$ at time $t$ is $k$, where $k$ is the
number of arrivals between the previous event and $t$. Let the threshold for the
event to occur be $\tau$, i.e., an event occurs as soon as $x_t \geq \tau$.
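The event-generating mechanism just described can be simulated directly. The sketch below assumes an integer threshold and that the event fires at the arrival that brings the level up to the threshold, so each inter-event time is a sum of that many unit-mean exponential waiting times; the function name and the seed are hypothetical choices.

```python
import random
import statistics


def inter_event_times(tau, n_events, rng):
    """Simulate inter-event times of the threshold model.

    Unit amounts of chemical arrive as a Poisson process with mean
    inter-arrival time 1; the level resets to 0 after every event.
    """
    times = []
    for _ in range(n_events):
        t = 0.0
        level = 0
        while level < tau:
            t += rng.expovariate(1.0)   # waiting time to the next unit arrival
            level += 1
        times.append(t)
    return times


if __name__ == "__main__":
    rng = random.Random(1)
    sample = inter_event_times(20, 20000, rng)
    # for tau = 20 the theoretical mean is 20 and the standard deviation sqrt(20)
    print(statistics.mean(sample), statistics.stdev(sample))
```

For a threshold of 20 the simulated mean and standard deviation should be close to 20 and 4.47, the values quoted for the simulation study in this section.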

Table 14.2. p-values for testing the hypothesis that $STL_{max}$ is better than the
optimal naive method. The "Size" row indicates the number of seizures examined.
The last two rows are the comparisons of the periodic and random prediction methods
with the optimal naive method.

Subject     CHAP    MASH    FUS     PRE     MORG    MERR    PHEL    Overall
Size        8       15      19      6       8       17      17      90
STLmax      0.344   0.104   0.190   0.090   0.336   0.102   0.196   0.015
Periodic    0.344   0.606   0.496   0.332   0.760   0.328   0.378   0.813
Random      0.344   0.794   0.778   0.332   0.452   0.574   0.378   0.848

Figure 14.4. Events generated by a compound Poisson process. One unit of chemical
arrives at each × according to a Poisson process. An event is triggered when the
accumulated level reaches $\tau$. The predictor can make an observation (o) at each
interval of length $\Delta$.

Suppose the predictor knows this process and the threshold $\tau$, but cannot
measure the chemical level precisely. Let the observed chemical level be

$$y_t = x_t + \epsilon_t, \tag{9}$$

where $\epsilon_t$ is the noise, assumed to be normally distributed with mean 0 and
variance $\sigma^2$. Based on this value, the predictor has to decide whether
to make a prediction and, if so, how far ahead. Since the expected occurrence
time of the next event is at

$$T = \tau - x_t, \tag{10}$$

we should predict the next event at $\tau - y_t - a/2$, where $a$ is the length of the
alert period. This prediction value has variance $\tau - x_t + \sigma^2$, which can be
estimated by $\tau - y_t + \sigma^2$ since $x_t$ is not observable. Evidently, if
$\tau - y_t$ is large, we should not predict, because the prediction tends to be
inaccurate. Whether to predict also depends on when we can take the next
measurement. We let this new parameter be $\Delta$, i.e., the chemical will be
measured every $\Delta$ units of time (see Fig. 14.4). As mentioned before, the optimal
decision should depend on a given false alarm rate, a given sensitivity, or the costs
in (4), but these parameters are usually difficult to come by, especially at the
research stage. In this example, the main purpose is to see whether the observed
chemical level has any predictive value. We use the following prediction rule: make
a prediction if the observed level $y_t$ is higher than $y_0$, and when we predict,
set the alert period at

$$\hat{t} = \theta (\tau - y_t - a/2), \tag{11}$$

where $y_0$ and $\theta$ are tuning parameters. The case $y_0 = \theta = 0$ is the
extreme in which we are alert all the time, and $y_0 = \infty$ is the other extreme
in which we never make any prediction. These two extreme conditions produce a
closed OC curve.
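The prediction rule can be written as a small decision function. The sketch below reads the quantity in (11) as the lead time from the current observation until the alert period begins, which is an interpretation made for this sketch; clamping negative lead times to zero (so that a noisy reading above the threshold triggers an immediate alert) is likewise an assumption, and the function name is hypothetical.

```python
import math


def prediction_rule(y_t, tau, a, y0, theta):
    """Decide whether to predict and, if so, how far ahead.

    y_t: observed (noisy) chemical level; tau: event threshold;
    a: alert period; y0, theta: tuning parameters.
    Returns None when no prediction is made, otherwise the lead time
    until the alert period starts.
    """
    if not y_t > y0:
        return None                       # observed level too low to predict
    lead = theta * (tau - y_t - a / 2.0)
    return max(0.0, lead)                 # assumed: alert immediately if lead < 0


if __name__ == "__main__":
    print(prediction_rule(10.0, 20.0, 4.0, 5.0, 0.5))      # predicts, lead 4.0
    print(prediction_rule(3.0, 20.0, 4.0, 5.0, 1.0))       # below y0, no prediction
    print(prediction_rule(10.0, 20.0, 4.0, math.inf, 1.0)) # never predicts
```

Sweeping the two tuning parameters over a grid and recording the resulting sensitivity and false alarm rate for each pair gives the points from which the OC curve of this scheme is estimated.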

The power is defined as the probability that this chemical measurement is
considered a useful predictor, i.e., that $H_1$ in (6) is accepted by test (7), when
$n$ events can be observed. We expect the power to increase when $n$ increases or
$\sigma^2$ decreases. However, even if there is no measurement error ($\sigma^2 = 0$),
the power is not 1.0 because the prediction error variance $\tau - x_t + \sigma^2$ is
still not 0.
Figure 14.5. Power curve for a scheme with inside information against the optimal
naive prediction scheme, where $n$ is the number of events in the sample and the blur
standard deviation (horizontal axis) is the measurement error of the inside
information (see Eq. (9)).

There are many parameters that affect the power of this prediction scheme.
Only one simulation study is shown here. In this case, the threshold $\tau$ is set
at 20; in other words, the inter-event arrival time has expected value 20 (units of
time) with standard deviation $\sqrt{20} \approx 4.47$. Thus there is considerable
variation in the event occurrence if no additional information is given. The alert
period is set at 4, and the chemical is examined every $\Delta = 2$ time units. The
tuning parameter $y_0$ is set at 0, 5, 10, and 15, and $\theta$ at 0, 0.5, 1.0, and
1.5. The results are given in Fig. 14.5, where $n$ is the event size. The curve shows
what we expected: power increases as $n$ increases and as the blur level $\sigma$
decreases.

This result shows that the optimal naive scheme is difficult to beat with small
sample sizes.

6. Concluding Remarks 

1 The key contribution of this research is to provide a rigorous statistical
test for validating any prediction scheme. The power of this test is also
examined with a reasonable model.

2 For the most commonly used exponential arrival, the naive optimal scheme
is either to be alert all the time or to give up prediction, if there is no inside
information.

3 For perfect prediction, the sensitivity reaches 1 with no false alarm, and the
area above the curve $A$ is 0. The validity of such a scheme is, of course, very
easy to prove, except when the events occur periodically with an exact period. In
this case the naive optimal scheme also has $A = 0$.

4 During our simulation experiment, we found that the optimal naive scheme
can be difficult to beat. For example, if the event arrival is fairly regular,
then any auxiliary information helps very little in prediction. This fact
obviously agrees with common sense: when an event comes regularly,
a good prediction scheme is no big deal. Any prediction scheme has to
prove its validity by very strong evidence (a large sample).

5 The method developed in this paper does not depend on the cost of a
false alarm or of missing an event. But when the method is mature enough
for practical use, the costs have to be taken into consideration. At that stage,
a test based on a given false alarm rate or cost structure should be developed.



Appendix: Optimal Prediction Strategy for the Next Event 
Figure 14.A.1. The relation between predictions $l_i$, events (down arrows), and
alert periods.

The general prediction scheme can be described as in Fig. 14.A.1, where predictions
occur at times $l_1, l_2, \ldots$ before the event occurs at $T$. When there is no other
auxiliary information before the occurrence, the $l_i$'s are predetermined. Let
$I_A(x)$ be the indicator function with value 1 if $x \in A$ and 0 otherwise. Then the
numbers of detections and false alarms are, respectively,

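$$\sum_{i=1}^{\infty} I_{[l_i,\, l_i + a]}(X) \quad \text{and} \quad \sum_{i=1}^{\infty} I_{(l_i + a,\, \infty)}(X), \tag{14.A.1}$$

where $X$ is the random variable of occurrence time $T$ in Fig. 14.A.1. Taking
expectations in (14.A.1), we find that the $\psi$ and $\phi$ defined in (1) and (2)
have the following expressions:

$$\psi = \sum_{i=1}^{\infty} \int_{l_i}^{l_i + a} f(x)\,dx,$$

$$\phi = -1 + \int_0^{l_1 + a} f(x)\,dx + 2 \int_{l_1 + a}^{l_2 + a} f(x)\,dx + 3 \int_{l_2 + a}^{l_3 + a} f(x)\,dx + \cdots,$$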
where $f(x)$ is the density function of $X$. The optimal allocation of
$l_1, l_2, \ldots$ that minimizes the total cost function (4) for a general occurrence
density function $f(x)$ is not easy to obtain. Some special cases, such as
exponentially distributed $X$, can be worked out analytically. Another special
case is when the hazard function $\lambda(x)$ is monotonically nondecreasing. In this
case, we may derive the optimal prediction strategy in the following manner.

Suppose we wish to predict that the event will occur at $x$, given that it has not
occurred before $x$. Then the loss is

$$c_2 - (c_1 + c_2)\,\frac{1}{1 - F(x)} \int_x^{x+a} f(u)\,du.$$

We should make the prediction if the loss is less than 0. This is equivalent to

$$\frac{\int_x^{x+a} f(u)\,du}{1 - F(x)} > \frac{c_2}{c_1 + c_2}. \tag{14.A.2}$$



For small $a$, the left-hand side of (14.A.2) is approximately $a\lambda(x)$. Thus if
$\lambda(x)$ is a nondecreasing function, the attendant should be at constant alert
as soon as (14.A.2) is satisfied. If this approximation is not good enough, then the
condition is that $[F(x + a) - F(x)]/[1 - F(x)]$ be a monotonically increasing
function of $x$.
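To make the rule concrete, the sketch below evaluates the conditional probability $[F(x + a) - F(x)]/[1 - F(x)]$ for an Erlang occurrence time and searches a grid for the first $x$ at which it exceeds $c_2/(c_1 + c_2)$. The Erlang model, the grid step, the search limit, and the function names are assumptions chosen for illustration only.

```python
import math


def erlang_survival(x, k):
    """1 - F(x) for an Erlang(k, rate 1) occurrence time."""
    return math.exp(-x) * sum(x ** j / math.factorial(j) for j in range(k))


def window_probability(x, a, k):
    """Probability the event falls in (x, x + a] given it has not occurred by x."""
    s_x = erlang_survival(x, k)
    return (s_x - erlang_survival(x + a, k)) / s_x


def alert_start(a, k, c1, c2, step=0.01, x_max=200.0):
    """First grid point where the alert condition holds, or None if it never does."""
    threshold = c2 / (c1 + c2)
    x = 0.0
    while x <= x_max:
        if window_probability(x, a, k) > threshold:
            return x
        x += step
    return None


if __name__ == "__main__":
    print(alert_start(4.0, 20, 1.0, 1.0))      # equal costs: alert starts late
    print(alert_start(4.0, 20, 1.0, 1000.0))   # false alarms very costly: None
    print(alert_start(4.0, 1, 1.0, 1.0))       # exponential case: alert from 0
```

The exponential case (k = 1) has a constant conditional probability, so the search either returns 0 or finds no starting point at all, which matches the all-or-nothing behaviour noted in the concluding remarks.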

A similar result was obtained in [12], but that result seems counterintuitive. In it,
there are three costs: $c_1$ = cost of predicting an event, $c_2$ = cost of a false
alarm, and $a$ = cost per unit time of maintaining the alert. The result there shows
that if $a = 0$ we should be alert all the time. This is counterintuitive because
with a large overhead cost $c_2$, we should also avoid false alarms. Our result
(14.A.2) is very intuitive for large $c_2$ (no prediction) or small $c_2$ (constant
alert). Moreover, the maintenance cost during the alert period can be easily embedded
in $c_2$, i.e., a false alarm cost is the sum of an initial preparation cost and a
maintenance cost for the alert period. Thus, the alert cost in [12] seems unnecessary.